Gene Switches

Field of the Invention 

This invention relates to molecular gene switches that use molecules capable of binding a 
specific DNA sequence in a ligand-dependent manner where the ligand itself is capable of 
binding DNA. Moreover, this invention relates to methods for the identification of said 
ligand-dependent DNA binding molecules. 

Background to the Invention 



Gene switches are currently of great interest to those wishing to control timing and/or
dosage of gene expression. Various gene switches have been developed in the prior art.
Most of these prior art switches are derived from gene regulatory proteins. In these
systems, the switching ligand binds to the protein, inducing a protein conformational
change that affects DNA binding.

It is often the case that a gene's expression is affected by one or more different protein(s).
Diverse proteins may influence expression of the same gene. Said protein(s) may be
present in a first cell or cell type, but these protein(s) may be absent from a second cell or
cell type. Therefore, a molecule which affects only a single known regulatory protein will
not have any effect on the expression of the same gene in a cell where this particular
regulatory protein is not expressed, or is otherwise sequestered. Thus, one of the
difficulties of the prior art is that a protein-binding switching molecule will have no effect
on the expression of a gene if the particular protein to which the switching molecule binds
is not present.

Similarly, a gene's expression may be affected by numerous different proteins in different
cells or cell types. A molecule which affects only a single known regulatory protein will
not have any effect on the expression of the same gene in a cell in which its expression is
controlled by a different protein or proteins. Therefore, one of the difficulties in the prior
art is that a plurality of switching molecules may be required in order to modulate or switch
the expression of a single gene.

Therefore, in order to effect switching of gene expression at a given DNA sequence,
independently of the particular activator protein, it is desirable to target the DNA. Further,
custom DNA binding proteins would benefit from switches; if these could be designed to
interact with DNA, there would be a greater freedom in the design of said proteins.

There are numerous polypeptide modifications which are known to affect the interaction of
polypeptides with a broad spectrum of molecules such as nucleic acids, polypeptides (both
intra- and inter-molecularly), other macromolecular structures such as membranes, small
molecules, ions, or other entities. Clearly, it is a problem that polypeptide modifications may
compromise the binding of prior art switching molecules to their polypeptide targets.

The present invention seeks to overcome such difficulties. 

Aspects of the present invention are set out in the claims and are described below.

Summary of the Invention

In a first aspect, the present invention provides a method of selecting a gene switch, which 
gene switch comprises (i) a target DNA molecule; (ii) a DNA binding molecule which
binds to the target DNA molecule in a manner modulatable by a DNA binding ligand; and 
(iii) the DNA binding ligand, which method comprises: 

(a) contacting one or more candidate target DNA molecule(s) with one or more 
candidate DNA binding molecules, in the presence of one or more DNA binding ligands, 
wherein at least one of the candidate DNA binding molecules comprises a non-naturally 
occurring DNA binding domain; 

(b) selecting a complex comprising a candidate target DNA, a DNA binding molecule 
and a DNA binding ligand; 

(c) isolating and/or identifying the unknown components of the complex; 

(d) comparing the binding of the DNA binding molecule component of the complex to
the target DNA component of the complex in the presence and absence of the DNA
binding ligand component of the complex; and

(e) selecting complexes where said binding differs in the presence and absence of the
DNA binding ligand component.

Preferably the DNA binding molecules are provided as a plurality of DNA binding
molecules, more preferably as a library of DNA binding molecules. Where only one DNA
binding molecule is included in the screen, the DNA binding molecule comprises a
non-naturally occurring DNA binding domain. The term "non-naturally occurring DNA
binding domain" means that the DNA binding domain does not occur in nature, even as
part of a larger molecule, and has been obtained by deliberate [...] or de
novo design techniques.

Preferably the target DNA is provided as a plurality of DNA sequences, more preferably as
a library of sequences being related to one another by sequence [...].

In one embodiment, a plurality of candidate DNA binding ligands are used, in which case
it is preferred to use one target DNA.

Preferably, the unknown component isolated and/or identified in step (c) is a DNA binding
ligand component or a DNA binding molecule component.

In a preferred embodiment of the first aspect of the invention, the selected DNA binding
molecule component has a higher affinity for the target DNA component in the presence of
the DNA binding ligand component than in the absence of the DNA binding ligand component.

[...] DNA binding molecule component [...]

[...] the candidate DNA binding [...]

The method of the present invention may be used to select a DNA binding molecule which
binds to a target DNA molecule in a manner modulatable by a DNA binding ligand.

The method of the present invention may also be used to select a target DNA to which
a DNA binding molecule binds in a manner modulatable by a DNA binding ligand.

The method of the present invention may also further be used to select a DNA binding 
ligand that modulates binding of a DNA binding molecule to a target DNA. 



Generally, the DNA binding ligand and the DNA binding molecule are different.

In a preferred aspect of the invention, said candidate molecules are polypeptides. In a more
preferred embodiment, said candidate molecules are polypeptides at least partly derived
from transcription factors. In an even more preferred embodiment, said candidate
molecules are derived from zinc finger transcription factors.

Advantageously, the candidate DNA binding molecules are provided as a phage display 
library. 

In a preferred aspect of the invention, the DNA binding ligand is selected from Distamycin 
A, Actinomycin D and echinomycin. 

In another aspect, the invention relates to a gene switch comprising (i) a target DNA
molecule; (ii) a DNA binding molecule which binds to the target DNA molecule in a 
manner modulatable by a DNA binding ligand; and (iii) the DNA binding ligand. In 
particular, the present invention relates to DNA binding molecules and/or DNA binding 
ligands and/or target DNA obtainable by the methods disclosed herein. 

The present invention also provides a method for engineering a novel class of gene
switches in which a DNA binding ligand affects or modulates the interaction of a DNA
binding molecule (for example a phage displayed polypeptide) with its target DNA. In a
preferred aspect, the present invention relates to the selection of DNA binding polypeptides
which recognise a particular DNA sequence or structure. Preferably, said method may
include selection of phage displayed polypeptides in the presence or absence of one or
more DNA binding ligands. Of the polypeptides which are selected under these conditions,
some may bind the DNA with higher affinity in the presence of ligand, whereas others may
bind the DNA with higher affinity in the absence of ligand.

The gene switches and components thereof can be used in methods of regulating gene
expression. Accordingly the present invention also provides a method of modulating the
expression of one or more genes, said method [...] DNA [...] of the invention to a
cell wherein the regulatory sequences of said genes comprise a target DNA selected
according to the method of the invention.

The present invention also provides a method of modulating the [...]
nucleotide sequences of interest in a host cell, which host cell [...] a
sequence capable of directing the expression of a DNA binding [...] a target DNA
sequence to which the DNA binding [...]
binding ligand, which method comprises administering said DNA binding [...]
wherein the DNA binding molecule is heterologous [...] preferentially active in the male
or female organs of the plant.

In a further aspect there is provided the use of a DNA binding molecule selected by the
method of the invention in a method of regulating transcription from a DNA sequence
comprising a target DNA to which the DNA binding molecule binds in a manner
modulatable by a DNA binding ligand.

Also provided is the use of a DNA binding ligand selected by the method of the invention
in a method of regulating transcription from a DNA sequence comprising a target DNA to
which a DNA binding molecule binds in a manner modulatable by the DNA binding ligand.

Also provided is the use of a target DNA selected by the method of the invention in a
method of regulating transcription from a DNA sequence comprising the target DNA to
which a DNA binding molecule binds in a manner modulatable by a DNA binding ligand.

In another aspect, the present invention provides a transgenic non-human organism
comprising a target DNA sequence and a nucleic acid sequence capable of directing the
expression of a DNA binding molecule which binds to the target DNA in a manner
modulatable by a DNA binding ligand, wherein the target DNA and/or the nucleic
acid sequence are heterologous to the organism.

Preferably the transgenic non-human organism is a plant.

Detailed Description of the Invention 

Definitions 

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture,
molecular genetics, nucleic acid chemistry, hybridization techniques and biochemistry).
Standard techniques are used for molecular, genetic and biochemical methods (see
generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short
Protocols in Molecular Biology (1999) 4th Ed, John Wiley & Sons, Inc., which are
incorporated herein by reference), chemical methods, pharmaceutical formulations and
delivery and treatment of patients.

The term "modulatable" is used to indicate that the binding of the DNA binding molecule to
DNA [...] DNA binding ligand can modulate, [...] regulate, adjust, [...] or [...] the
DNA binding molecule to the DNA.

The term "isolating", in the context of the invention, refers to [...] one or
more components or molecules from a sample of candidate molecules [...]
the methods disclosed herein.

The term "complex" is used to describe the association between [...]
molecules as defined herein.

The term "gene switch" is used herein to describe a multi-component [...]
target DNA molecule; a DNA binding molecule which binds to the target DNA
molecule in a manner modulatable by a DNA binding ligand; and the DNA binding
ligand. The DNA binding molecule may or may not comprise a [...]
domain, especially when part of an assay [...].
[...] used to regulate transcription from one or more [...], the DNA binding
molecule may need to be modified to include a [...]
domain, if one is not already present.

The terms "DNA binding molecule", "DNA binding ligand" and "target DNA" are used
extensively herein. However, types of nucleic acids other than DNA may be relevant.
Consequently, it is intended that in general the above terms can be replaced with the terms
"nucleic acid binding molecule", "nucleic acid binding ligand" and "target nucleic acid".
Nucleic acids in general [...] DNA, [...] single stranded.
RNA is preferably at least partially double-stranded in the context of the
invention. However, in a preferred aspect of the invention, references to "DNA" mean
deoxyribonucleic acid in a literal sense.

A. DNA binding molecules 

The term 'DNA binding molecule' includes any molecule which is capable of binding or
associating with DNA. This binding or association may be via covalent bonding, via ionic
bonding, via hydrogen bonding, via Van der Waals bonding, or via any other type of
reversible or irreversible association.

The term 'molecule' is used herein to refer to any atom, ion, molecule, macromolecule (for
example polypeptide), or combination of such entities. The term 'ligand' is used
interchangeably with the term 'molecule'. Molecules according to the invention may be free
in solution, or may be partially or fully immobilised. They may be present as discrete
entities, or may be complexed with other molecules. Preferably, molecules according to
the invention include polypeptides displayed on the surface of bacteriophage particles.
More preferably, molecules according to the invention include libraries of polypeptides
presented as integral parts of the envelope proteins on the outer surface of bacteriophage
particles. Methods for the production of libraries encoding randomised polypeptides are
known in the art and may be applied in the present invention. Randomisation may be total
or partial; in the case of partial randomisation, the selected codons preferably encode
options for amino acids, and not for stop codons.
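By way of illustration only, the preference for codons that encode amino acids rather than stop codons can be checked computationally. The following minimal sketch expands a degenerate codon written in IUPAC nucleotide codes and reports any stop codons of the standard genetic code that it can produce. The schemes "NNN", "NNK" and "VNN" and the function names are assumptions chosen for the example; they are not prescribed by the present description.

```python
from itertools import product

# IUPAC degenerate nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def expand_codon(degenerate):
    """Return every concrete codon encoded by a three-letter degenerate codon."""
    if len(degenerate) != 3:
        raise ValueError("a codon must have exactly three positions")
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in degenerate))]


def stop_codons_in(degenerate):
    """Return the stop codons, if any, that the degenerate codon can produce."""
    return sorted(c for c in expand_codon(degenerate) if c in STOP_CODONS)


# Hypothetical randomisation schemes used only to exercise the functions.
for scheme in ("NNN", "NNK", "VNN"):
    codons = expand_codon(scheme)
    print(scheme, len(codons), "codons; stop codons:", stop_codons_in(scheme))
```

In this sketch a scheme such as "VNN" produces no stop codons because all three stop codons begin with T, at the cost of excluding every amino acid whose codons all begin with T.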

The term 'candidate DNA binding molecules' is used to describe any one or more
molecule(s) as defined above which may or may not be capable of binding DNA. The
capability of said molecules to bind DNA may or may not be modulatable by a DNA
binding ligand. The latter of these properties may be investigated by the methods of this
invention. Preferably, candidate DNA binding molecules comprise a plurality of, or a
library of, polypeptides. More preferably, these polypeptides are, or are derived from, DNA
binding proteins such as DNA repair enzymes, polymerases, recombinases, methylases,
restriction enzymes, replication factors, histones, or DNA binding structural proteins such
as chromosomal scaffold proteins; even more preferably said polypeptides are derived from
transcription factors. 'Derived from' means that the candidate DNA binding molecules
preferably comprise one or more of: transcription factors, fragment(s) of transcription
factors, sequences homologous to transcription factors, or polypeptides which have been
fully or partially randomised from a starting sequence which is a transcription factor, a
fragment of a transcription factor, or homologous to a transcription factor. Most
preferably, candidate DNA binding molecules comprise polypeptides which are at least
40% homologous, more preferably at least 60% homologous, even more preferably at least
75% homologous or even more, for example 85%, or 90%, or even more than 95%
homologous to one or more transcription factors, using one of the homology calculation
algorithms defined below.



Candidate DNA binding molecules may comprise, among other things, DNA binding 
part(s) of any protein(s), for example zinc finger transcription factors, Zif268, ATF family 
transcription factors, ATF1, ATF2, bZIP proteins, CHOP, NF-kB, TATA binding protein 
(TBP), MDM, c-jun, elk, serum response factor (SRF), ternary complex factor (TCF); 
KRUPPEL, Odd Skipped, even skipped and other D. melanogaster transcription factors;
yeast transcription factors such as GCN4, the GAL family of galactose-inducible
transcription factors; bacterial transcription factors or repressors such as lacIq, or fragments
or derivatives thereof. Derivatives would be considered by a person skilled in the art to be 
functionally and/or structurally related to the molecule(s) from which they are derived, for 
example through sequence homology of at least 40%. 

The candidate DNA binding molecules may be non-randomised polypeptides, for example 
'wild-type' or allelic variants of naturally occurring polypeptides, or may be specific 
mutant(s), or may be wholly or partially randomised polypeptides, preferably structurally 
related to DNA binding proteins as described herein. 

In a highly preferred embodiment, these polypeptide candidate DNA binding molecules are
displayed on the surface of bacteriophage particles, and are preferably partially randomised
zinc-finger type transcription factors, preferably retaining at least 40% homology (as
described herein) to zinc-finger type transcription factors.

In some cases, sequence homology may be considered in relation to structurally important 
residues, or those residues which are known or suspected of being evolutionarily 
conserved. In such instances, residues known to be variable or non-essential for a 
particular structural conformation may be discounted from the homology calculation. For 
example, as explained herein, zinc fingers are known to have certain residues which are 
important for the formation of the three-dimensional zinc finger structure. In these cases, 
homology may be considered over about seven of said important amino acid residues 
amongst approximately thirty residues which may comprise the whole finger structure. 

As used herein, the term homology may refer to structural homology. Structural homology
may be estimated by comparing the structural RMS deviation of the main part of the carbon
atom backbone of two or more molecules. Preferably, the molecules may be considered
structurally homologous if the deviation is 5 Å or less, preferably 3 Å or less, more
preferably 1.5 Å or less. Structurally homologous molecules will not necessarily show
significant sequence homology.
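As a non-limiting illustration, the RMS deviation between two backbones can be computed as sketched below. The sketch assumes that the two sets of backbone carbon coordinates (in ångströms) contain the same number of atoms in corresponding order and have already been optimally superposed; the superposition step itself is not shown. The function names and the example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superposed and atom-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def homology_tier(deviation):
    """Map an RMS deviation in angstroms onto the thresholds given in the text."""
    if deviation <= 1.5:
        return "structurally homologous (more preferred, 1.5 angstrom or less)"
    if deviation <= 3.0:
        return "structurally homologous (preferred, 3 angstrom or less)"
    if deviation <= 5.0:
        return "structurally homologous (5 angstrom or less)"
    return "not structurally homologous by this criterion"


# Hypothetical coordinates: the second backbone is displaced by 1 angstrom in y.
backbone_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
backbone_b = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
print(homology_tier(rmsd(backbone_a, backbone_b)))
```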

Candidate DNA binding molecules, as defined above, may be prescreened prior to being
tested in the methods of the invention using routine assays known in the art for determining
the binding of molecules to nucleic acids so as to eliminate molecules that do not bind DNA.
For example, a candidate DNA binding molecule, preferably a library of candidate DNA
binding molecules, is contacted with nucleic acid and binding is determined. The nucleic
acids may for example be labelled with a detectable label, such as a
fluorophore/fluorochrome, such that after a wash step binding can be determined easily, for
example by monitoring fluorescence. Other methods for measuring binding to DNA are set
out in section E below.

The nucleic acid with which the candidate binding ligands are contacted may be
non-specific nucleic acid, such as a random oligonucleotide library or sonicated genomic DNA
and the like. Alternatively, a specific sequence or a partially randomised library
of sequences may be used.

Preferably, the DNA binding molecules of the invention may bind the target nucleic acid
with different affinity in the presence or in the absence of ligand. The binding to the
nucleic acid may be enhanced by the presence of the ligand (i.e. bind with a higher affinity
in the presence of ligand), or may be reduced in the presence of ligand (i.e. bind with a
lower affinity in the presence of ligand). In the case where association of the DNA binding
molecule(s) with the target nucleic acid is enhanced by the presence of ligand, said
association may be additive with the binding of the ligand, or may be synergistic with the
binding of the ligand, or may affect the binding in another way. If the binding is
synergistic with the binding of the ligand, said binding may be either wholly or partly
dependent on the presence of the ligand. Preferably, the characteristics of binding may be
such that the DNA binding molecule(s) may be eluted by addition of an excess of the DNA
binding ligand.
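The comparison of binding in the presence and absence of ligand can be sketched as follows. The sketch assumes, purely for illustration, that affinity has been measured as an apparent dissociation constant (a lower value meaning tighter binding) under both conditions, and that a two-fold change is taken as the cut-off; neither the measurement format nor the cut-off is specified in the present description, and the function and parameter names are hypothetical.

```python
def classify_ligand_effect(kd_without, kd_with, min_fold=2.0):
    """Classify how a ligand changes binding of a molecule to its target DNA.

    kd_without / kd_with: apparent dissociation constants (same units) measured
    in the absence and presence of ligand. min_fold is an assumed cut-off.
    """
    if kd_without <= 0 or kd_with <= 0:
        raise ValueError("dissociation constants must be positive")
    if min_fold <= 1:
        raise ValueError("min_fold must be greater than 1")
    fold = kd_without / kd_with  # > 1 means tighter binding with ligand
    if fold >= min_fold:
        return "enhanced by ligand"
    if fold <= 1.0 / min_fold:
        return "reduced by ligand"
    return "not modulated"


# Hypothetical measurements (arbitrary units) for three candidate clones.
measurements = {"clone_1": (100.0, 10.0), "clone_2": (10.0, 100.0), "clone_3": (10.0, 12.0)}
for name, (kd_minus, kd_plus) in measurements.items():
    print(name, classify_ligand_effect(kd_minus, kd_plus))
```

Candidates falling in either of the first two classes would correspond to molecules whose binding differs in the presence and absence of ligand.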

DNA binding molecules according to the invention are preferably polypeptide sequences,
optionally encoded by nucleic acid sequences. Fragments, mutants, alleles and other
derivatives of the molecules of the invention preferably retain substantial homology with
said sequence(s). As used herein, "homology" means that the two entities share sufficient
characteristics for the skilled person to determine that they are similar. Preferably,
homology is used to refer to sequence identity. Thus, the derivatives of said DNA binding
molecules of the invention preferably retain substantial sequence identity with said
molecules.

In the context of the present invention, a homologous sequence is taken to include any
sequence which is at least 60, 70, 80 or 90% identical, preferably at least 95 or 98%
identical over at least 5, preferably 8, 10, 15, 20, 30, 40 or even more residues or bases with
the molecules (i.e. the sequences thereof) of the invention, for example, as shown in the
sequence listing herein. In particular, homology should typically be considered with
respect to those regions of the molecule(s) which may be known to be functionally
important rather than non-essential neighbouring sequences. Although homology can also
be considered in terms of similarity (i.e. amino acid residues having similar chemical
properties/functions), in the context of the present invention it is preferred to express
homology in terms of sequence identity.

Homology comparisons can be conducted by eye, or more usually, with the aid of readily
available sequence comparison programs. These commercially available computer programs 
can calculate % homology between two or more sequences. 

% homology may be calculated over contiguous sequences, i.e. one sequence is aligned with 
the other sequence and each amino acid in one sequence directly compared with the 
corresponding amino acid in the other sequence, one residue at a time. This is called an 
"ungapped" alignment. Typically, such ungapped alignments are performed only over a 
relatively short number of residues (for example less than 50 contiguous amino acids). 

Although this is a very simple and consistent method, it fails to take into consideration that,
for example, in an otherwise identical pair of sequences, one insertion or deletion will cause
the following amino acid residues to be put out of alignment thus potentially resulting in a
large reduction in % homology when a global alignment is performed. Consequently, most
sequence comparison methods are designed to produce optimal alignments that take into
consideration possible insertions and deletions without penalising unduly the overall
homology score. This is achieved by inserting "gaps" in the sequence alignment to try to
maximise local homology.
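The effect described above can be illustrated with a minimal sketch of an ungapped comparison, in which residues are compared position by position and % identity is the number of matching positions divided by the length compared. The sequences are hypothetical and the function name is an assumption; the sketch is not the algorithm of any particular alignment package.

```python
def ungapped_identity(seq_a, seq_b):
    """Percent identity of two equal-length sequences compared residue by residue."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical nine-residue peptide.
reference = "ACDEFGHIK"
# Same peptide with a single W inserted after the second residue, then
# truncated to nine residues so that an ungapped comparison is possible.
with_insertion = "ACWDEFGHI"

print(ungapped_identity(reference, reference))
print(ungapped_identity(reference, with_insertion))
```

Only the two residues preceding the insertion still match in the second comparison, even though eight of the nine original residues are present in both sequences; this is the loss that gapped alignment is intended to avoid.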

However, these more complex methods assign "gap penalties" to each gap that occurs in the
alignment so that, for the same number of identical amino acids, a sequence alignment with
as few gaps as possible - reflecting higher relatedness between the two compared sequences -
will achieve a higher score than one with many gaps. "Affine gap costs" are typically used
that charge a relatively high cost for the existence of a gap and a smaller penalty for each
subsequent residue in the gap. This is the most commonly used gap scoring system. High
gap penalties will of course produce optimised alignments with fewer gaps. Most alignment
programs allow the gap penalties to be modified. However, it is preferred to use the default
values when using such software for sequence comparisons. For example when using the
GCG Wisconsin Bestfit package (see below) the default gap penalty for amino acid
sequences is -12 for a gap and -4 for each extension.
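A minimal sketch of an affine gap cost is given below, using the values 12 and 4 quoted above. It assumes that the opening charge covers the first gap position and that each subsequent residue in the gap incurs the extension charge, as the description of affine gap costs above suggests; individual programs may count the first position differently, so the exact totals are illustrative only. The function names are hypothetical.

```python
def affine_gap_cost(gap_length, gap_open=12, gap_extend=4):
    """Cost of one gap: an opening charge plus a smaller charge per further residue.

    Assumption: the opening charge covers the first gapped position.
    """
    if gap_length < 0:
        raise ValueError("gap length cannot be negative")
    if gap_length == 0:
        return 0
    return gap_open + gap_extend * (gap_length - 1)


def total_gap_cost(gap_lengths, gap_open=12, gap_extend=4):
    """Sum of affine costs over every gap in an alignment."""
    return sum(affine_gap_cost(n, gap_open, gap_extend) for n in gap_lengths)


# Six gapped positions arranged as one long gap versus three short gaps.
print(total_gap_cost([6]))
print(total_gap_cost([2, 2, 2]))
```

Under these assumptions the single long gap is charged less than the three short gaps covering the same number of positions, which is why an affine scheme favours alignments with fewer gaps.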

Calculation of maximum % homology therefore firstly requires the production of an optimal
alignment, taking into consideration gap penalties. A suitable computer program for carrying
out such an alignment is the GCG Wisconsin Bestfit package (University of Wisconsin,
U.S.A.; Devereux et al., 1984, Nucleic Acids Research 12:387). Examples of other
software that can perform sequence comparisons include, but are not limited to, the BLAST
package (see Ausubel et al., 1999 ibid., Chapter 18), FASTA (Altschul et al., 1990, J. Mol.
Biol., 403-410) and the GENEWORKS suite of comparison tools. Both BLAST and
FASTA are available for offline and online searching (see Ausubel et al., 1999 ibid., pages
7-58 to 7-60). However it is preferred to use the GCG Bestfit program.

Although the final % homology can be measured in terms of identity, the alignment process
itself is typically not based on an all-or-nothing pair comparison. Instead, a scaled
similarity score matrix is generally used that assigns scores to each pairwise comparison
based on chemical similarity or evolutionary distance. An example of such a matrix
commonly used is the BLOSUM62 matrix - the default matrix for the BLAST suite of
programs. GCG Wisconsin programs generally use either the public default values or a
custom symbol comparison table if supplied (see user manual for further details). It is
preferred to use the public default values for the GCG package, or in the case of other
software, the default matrix, such as BLOSUM62.
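The difference between an all-or-nothing identity comparison and a scaled similarity score can be sketched as follows. The score table is a small toy table invented for this example and should not be read as an extract of BLOSUM62 or of any published matrix; the default mismatch score and the function names are likewise assumptions.

```python
# Toy similarity scores for a handful of residue pairs (illustrative values only).
TOY_SCORES = {
    ("L", "L"): 4, ("I", "I"): 4, ("L", "I"): 2,
    ("D", "D"): 6, ("E", "E"): 5, ("D", "E"): 2,
}
DEFAULT_MISMATCH = -4  # assumed score for any pair not listed above


def pair_score(a, b):
    """Look up the score of a residue pair, treating the table as symmetric."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), DEFAULT_MISMATCH))


def alignment_score(seq_a, seq_b):
    """Sum of pairwise scores over an ungapped, equal-length alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be of equal length")
    return sum(pair_score(a, b) for a, b in zip(seq_a, seq_b))


def identical_positions(seq_a, seq_b):
    """Number of positions at which the two sequences carry the same residue."""
    return sum(a == b for a, b in zip(seq_a, seq_b))


# Both comparisons have exactly one identical position, but the first pairs
# chemically similar residues and therefore scores higher.
for other in ("IEL", "DLL"):
    print(other, identical_positions("LDL", other), alignment_score("LDL", other))
```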

Once the software has produced an optimal alignment, it is possible to calculate % 
homology, preferably % sequence identity. The software typically does this as part of the 
sequence comparison and generates a numerical result. 

DNA binding molecules according to the invention may include any atom, ion, molecule,
macromolecule (for example polypeptide), or combination of such entities that are capable
of binding to nucleic acids, such as DNA. Advantageously, molecules according to the
invention may include families of polypeptides with known or suspected nucleic acid
binding motifs. These may include for example zinc finger proteins (see below).
Molecules according to the invention may also include helix-turn-helix proteins,
homeodomains, leucine zipper proteins, helix-loop-helix proteins or β-sheet motifs which
are well known to a person skilled in the art.

According to the invention, DNA binding motifs of one or more known or suspected 
nucleic acid binding polypeptide(s) may advantageously be randomised, in order to provide 
libraries of candidate nucleic acid binding molecules. 

Crystal structures may advantageously be used in selecting or predicting the relevant DNA 
binding regions of nucleic acid binding proteins by methods known in the art. 

DNA binding regions of proteins within the same structural family are often conserved or
homologous to one another, for example zinc finger α-helices, the leucine zipper basic
region, homeodomain helix 3.



General considerations and rules governing the binding of several polypeptide families to
nucleic acids are set out in the literature, e.g. in Suzuki et al., 1994; PNAS vol 91 pp
12357-61. Nucleic acid binding criteria for zinc fingers as preferred DNA binding
molecules according to the present invention are set out in this application (see above).

It is also envisaged that the methods of the present invention could be advantageously
applied to the selection of ligand-modulatable DNA binding molecules from other families
of transcription factors, for example from the helix-turn-helix (HTH) family and/or from
the probe helix (PH) family, and/or from the C4 Zinc-binding family (which includes the
hormone receptor (HR) family), from the Gal4 family, from the c-myb family, from other
zinc finger families, or from any other family of DNA binding proteins known to one
skilled in the art.

One or more polypeptides from one or more of these families could be advantageously
randomised to provide a library of candidate molecules for use in the methods of the
invention. Preferably, the amino acid residues known to be important for nucleic acid
binding could be randomised. However, it may be desirable to randomise other regions of
the DNA binding molecule since alterations to the amino acid sequence outside of those
elements of secondary structure that present amino acids that contact the DNA are likely to
cause conformational changes that may affect the DNA binding properties of the molecule.

For example, randomisation may involve alteration of zinc finger polypeptides, said
alteration being accomplished at the DNA or protein level. Mutagenesis and screening of
zinc finger polypeptides may be achieved by any suitable means. Preferably, the
mutagenesis is performed at the nucleic acid level, for example by synthesising novel genes
encoding mutant polypeptides and expressing these to obtain a variety of different proteins.
Alternatively, existing genes can themselves be mutated, such as by site-directed or random
mutagenesis, in order to obtain the desired mutant genes.



Mutations may be performed by any method known to those of skill in the art. Preferred,
however, is site-directed mutagenesis of a nucleic acid sequence encoding the protein of
interest. A number of methods for site-directed mutagenesis are known in the art, from
methods employing single-stranded phage such as M13 to PCR-based techniques (see
"PCR Protocols: A guide to methods and applications", M.A. Innis, D.H. Gelfand, J.J.
Sninsky, T.J. White (eds.), Academic Press, New York, 1990). Preferably, the
commercially available Altered Site II Mutagenesis System (Promega) may be employed,
according to the manufacturer's instructions.

Randomisation of the zinc finger binding motifs is preferably directed to those amino acid 
residues where the code provided herein gives a choice of residues (see below). For
example, positions +1, +5 and +8 are advantageously randomised, whilst preferably 
avoiding hydrophobic amino acids; positions involved in binding to the nucleic acid, 
notably -1, +2, +3 and +6, may be randomised also, preferably within the choices provided 
by the rules of the present invention. 
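The theoretical diversity of a library randomised at these positions can be estimated as sketched below. The sketch makes two assumptions that are not taken from the present description: that "hydrophobic amino acids" means the set A, V, L, I, M, F and W, and that the nucleic acid contacting positions are allowed all twenty amino acids rather than the narrower choices given by the rules referred to above. The names used are hypothetical.

```python
import math

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
HYDROPHOBIC = set("AVLIMFW")  # assumed definition of "hydrophobic" for this sketch

# Choices allowed at each randomised position of the zinc finger helix.
POSITION_CHOICES = {
    "-1": AMINO_ACIDS,                # contacts the nucleic acid (assumed unrestricted)
    "+1": AMINO_ACIDS - HYDROPHOBIC,  # randomised, hydrophobic residues avoided
    "+2": AMINO_ACIDS,
    "+3": AMINO_ACIDS,
    "+5": AMINO_ACIDS - HYDROPHOBIC,
    "+6": AMINO_ACIDS,
    "+8": AMINO_ACIDS - HYDROPHOBIC,
}


def library_size(choices):
    """Number of distinct polypeptides if every position varies independently."""
    return math.prod(len(options) for options in choices.values())


print("theoretical protein diversity:", library_size(POSITION_CHOICES))
```

Restricting the contact positions to the choices provided by the rules would reduce this figure, which may matter when the theoretical diversity approaches the number of clones that can be screened in practice.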

Screening of the proteins produced by mutant genes is preferably performed by expressing
the genes and assaying the binding ability of the protein product. A simple and
advantageously rapid method by which this may be accomplished is by phage display, in
which the mutant polypeptides are expressed as fusion proteins with the coat proteins of
filamentous bacteriophage, such as the minor coat protein pIII of bacteriophage M13 or gene
III of bacteriophage fd, and displayed on the capsid of bacteriophage transformed with the
mutant genes. The target nucleic acid sequence is used as a probe to bind directly to the
protein on the phage surface and select the phage possessing advantageous mutants, by
affinity purification. The phage are then amplified by passage through a bacterial host, and
subjected to further rounds of selection and amplification in order to enrich the mutant pool
for the desired phage and eventually isolate the preferred clone(s). Detailed methodology
for phage display is known in the art and set forth, for example, in US Patent 5,223,409;
Choo and Klug, (1995) Current Opinion in Biotechnology 6:431-436; Smith, (1985)
Science 228:1315-1317; and McCafferty et al., (1990) Nature 348:552-554; all
incorporated herein by reference. Vector systems and kits for phage display are available
commercially, for example from Pharmacia.
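The enrichment achieved by repeated rounds of selection and amplification can be illustrated with a simple idealised model. The model assumes that, in each round, phage displaying the desired polypeptide are recovered a fixed number of times more efficiently than all other phage, and that amplification preserves the resulting proportions. The starting fraction and the enrichment factor are hypothetical values chosen for the example, not measured results.

```python
def enrich(fraction, enrichment_factor, rounds):
    """Fraction of desired phage in the pool after each round of selection.

    fraction: starting fraction of desired phage (between 0 and 1, exclusive).
    enrichment_factor: assumed relative recovery of desired over other phage.
    Returns a list starting with the input fraction, one entry per round after.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    if enrichment_factor <= 0:
        raise ValueError("enrichment factor must be positive")
    history = [fraction]
    for _ in range(rounds):
        desired = fraction * enrichment_factor
        fraction = desired / (desired + (1.0 - fraction))
        history.append(fraction)
    return history


# Hypothetical pool: one desired clone per million, 100-fold enrichment per round.
for round_number, value in enumerate(enrich(1e-6, 100.0, 4)):
    print(round_number, f"{value:.6f}")
```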

Specific peptide ligands such as zinc finger polypeptides may moreover be selected for 
binding to targets by affinity selection using large libraries of peptides linked to the 
C-terminus of the lac repressor Lacl (Cull et al, (1992) Proc Natl Acad Sci USA, 89, 
1865-9). When expressed in E. coli the repressor protein physically links the ligand to the 
encoding plasmid by binding to a lac operator sequence on the plasmid. 

An entirely in vitro polysome display system has also been reported (Mattheakis et al., 
(1994) Proc Natl Acad Sci USA, 91, 9022-6) in which nascent peptides are physically 
attached via the ribosome to the RNA which encodes them. Furthermore, polypeptides 
may be partitioned in physical compartments, for example wells of an in vitro dish, or 
subcellular compartments, or in small fluid particles or droplets such as emulsions; further 
teachings on this topic may be found in Griffith et al., (see WO 99/02671). 



A library for use in the invention may be randomised at those positions for which choices 
are given in the rules of the first embodiment of the present invention. The rules set forth 
above allow the person of ordinary skill in the art to make informed choices concerning the 
desired codon usage at the given positions. 

The recognition helix of PH family polypeptides contains conserved Arg/Lys residues 
which are important structural elements involved in the binding of phosphates in the 
nucleic acid. Base specificity is attributed to amino acids 1, 4, 5 and 8 of the helix. These 
residues could be advantageously varied, for example amino acid 1 could be selected from 
Asn, Asp, His, Val or Ile to provide the possibility of binding to A, C, G or T. Similarly, 
amino acid 4 could be selected from Asn, Asp, His, Val, Ile, Gln, Glu, Arg, Lys, Met or 
Leu to provide the possibility of binding to A, C, G or T. Preferably, the rules laid out in 
(Suzuki et al., 1994; PNAS vol 91 pp 12357-61) would be used in order to randomise those 
amino acids which affect interaction of the molecule with the nucleic acid, whether in a 
base specific manner, or via binding to the phosphate backbone, thereby producing a 
library of candidate nucleic acid binding molecules for use in the methods of the invention. 

Similarly, polypeptide molecules of the helix-turn-helix family could be randomised to 
produce a library of candidate molecules, at least some of which may preferably be capable 
of binding nucleic acid in a ligand-dependent manner when used in the methods of the 
present invention. In particular, amino acids 1, 2, 5 and 6 are known to be conserved and 
function in base-specific nucleic acid binding in HTH motifs. Therefore, at least amino 
acids 1, 2, 5 or 6 would preferably be randomised so as to produce molecules for use 
according to the present invention. More preferably, amino acids 1, 5 and 6 could be 
selected from Asn, Asp, His, Val, Ile, Glu, Gln, Arg, Met, Lys or Leu, and amino acid 2 
could be selected from Asn, Asp, His, Val, Ile, Glu, Gln, Arg, Met, Lys, Leu, Cys, 
Ser, Thr or Ala. 

Another family of transcription factors which may be advantageously employed in the 
methods of the current invention are the C4 family which includes hormone receptor type 
transcription factors. It is envisaged that polypeptides of this family could advantageously 
be used to provide candidate molecules for use in selecting nucleic acid binding molecules 
whose association with nucleic acid is modulatable by a nucleic acid binding ligand. 
Amino acids 1, 4, 5 and 9 of the C4 motif are known to be involved in contacting the DNA, 
and therefore these residues would preferably be altered to provide a plurality of different 
molecules which may bind DNA in a ligand dependent manner. Preferably, amino acids 
1 and 5 could be selected from Asn, Asp, His, Val, Ile, Glu, Gln, Arg, Met, Lys or 
Leu, and amino acids 4 and 9 could be selected from Gln, Glu, Arg, Lys, Leu or Met. 
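
The size of a library restricted to these choices can be worked out directly. The following Python sketch enumerates the combinations for the C4 motif positions named above; the variable and function names are illustrative assumptions, and the residue lists are transcribed from the preceding paragraph.

```python
from itertools import product
from math import prod

# Choices stated above for C4 motif positions 1 and 5, and for positions 4 and 9.
BROAD = ["Asn", "Asp", "His", "Val", "Ile", "Glu", "Gln", "Arg", "Met", "Lys", "Leu"]
NARROW = ["Gln", "Glu", "Arg", "Lys", "Leu", "Met"]

C4_OPTIONS = {1: BROAD, 4: NARROW, 5: BROAD, 9: NARROW}


def library_size(options: dict[int, list[str]]) -> int:
    """Number of distinct residue combinations across the randomised positions."""
    return prod(len(choices) for choices in options.values())


def iter_variants(options: dict[int, list[str]]):
    """Yield each variant as a mapping from position to chosen residue."""
    positions = sorted(options)
    for combo in product(*(options[p] for p in positions)):
        yield dict(zip(positions, combo))


print(library_size(C4_OPTIONS))          # 11 * 6 * 11 * 6 combinations
print(next(iter_variants(C4_OPTIONS)))   # first variant in enumeration order
```

A library of this size is small compared with full randomisation of four positions (20 to the power 4, or 160,000 combinations), which is the practical benefit of restricting the choices.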

Particularly preferred examples of DNA binding molecules are Cys2-His2 zinc finger 
binding proteins which, as is well known in the art, bind to target nucleic acid sequences 
via α-helical zinc metal atom co-ordinated binding motifs known as zinc fingers. Each 
zinc finger in a zinc finger nucleic acid binding protein is responsible for determining 
binding to a nucleic acid triplet, or an overlapping quadruplet, in a nucleic acid binding 
sequence. Preferably, there are 2 or more zinc fingers, for example 2, 3, 4, 5 or 6 zinc 
fingers, in each binding protein. Advantageously, there are 3 zinc fingers in each zinc 
finger binding protein. 

Thus, in one embodiment, the invention provides a method for preparing a DNA binding 
polypeptide of the Cys2-His2 zinc finger class capable of binding to a target DNA 
sequence, wherein binding is via a zinc finger DNA binding motif of the polypeptide, and 
wherein said binding is modulatable by a DNA binding ligand. 

All of the DNA binding residue positions of zinc fingers, as referred to herein, are 
numbered from the first residue in the α-helix of the finger, ranging from +1 to +9. "-1" 
refers to the residue in the framework structure immediately preceding the α-helix in a 
Cys2-His2 zinc finger polypeptide. Residues referred to as "++" are residues present in an 
adjacent (C-terminal) finger. Where there is no C-terminal adjacent finger, "++" 
interactions do not operate. 

The present invention is in one aspect concerned with the production of what are 
essentially artificial DNA binding proteins. In these proteins, artificial analogues of amino 
acids may be used, to impart the proteins with desired properties or for other reasons. 
Thus, the term "amino acid", particularly in the context where "any amino acid" is referred 
to, means any sort of natural or artificial amino acid or amino acid analogue that may be 
employed in protein construction according to methods known in the art. Moreover, any 
specific amino acid referred to herein may be replaced by a functional analogue thereof, 
particularly an artificial functional analogue. The nomenclature used herein therefore 
specifically comprises within its scope functional analogues or mimetics of the defined 
amino acids. 

The α-helix of a zinc finger binding protein aligns antiparallel to the nucleic acid strand, 
such that the primary nucleic acid sequence is arranged 3' to 5' in order to correspond with 
the N-terminal to C-terminal sequence of the zinc finger. Since nucleic acid sequences are 
conventionally written 5' to 3', and amino acid sequences N-terminus to C-terminus, the 
result is that when a nucleic acid sequence and a zinc finger protein are aligned according 
to convention, the primary interaction of the zinc finger is with the - strand of the nucleic 
acid, since it is this strand which is aligned 3' to 5'. These conventions are followed in the 
nomenclature used herein. It should be noted, however, that in nature certain fingers, such 
as finger 4 of the protein GLI, bind to the + strand of nucleic acid: see Suzuki et al., (1994) 
NAR 22:3397-3405 and Pavletich and Pabo, (1993) Science 261:1701-1707. The 
incorporation of such fingers into DNA binding molecules according to the invention is 
envisaged. 
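
The antiparallel arrangement can be made concrete with a short Python sketch that pairs fingers with triplets. It assumes that the sequence supplied is the strand the fingers primarily contact, written 5' to 3', that each finger reads one non-overlapping triplet, and that fingers are numbered from the N-terminus; the function name and the example sequence are hypothetical.

```python
def assign_triplets(target_5_to_3: str) -> list[tuple[int, str]]:
    """Pair finger numbers (1 = N-terminal) with triplets of the contacted strand.

    Each triplet is still written 5' to 3'; only the order of the triplets is
    reversed, because the helix runs antiparallel to the strand.
    """
    seq = target_5_to_3.upper()
    if len(seq) % 3 != 0:
        raise ValueError("target length must be a multiple of three")
    triplets = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # The N-terminal finger sits on the 3'-most triplet of the contacted strand.
    return list(enumerate(reversed(triplets), start=1))


assignment = assign_triplets("AAACCCGGG")
for finger, triplet in assignment:
    print(f"finger {finger} -> 5'-{triplet}-3'")
```

For fingers that bind the opposite strand, such as the GLI finger mentioned above, the same function could be applied to the reverse complement of the sequence.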



The present invention may be integrated with the rules set forth for zinc finger polypeptide 
design in our copending European or PCT patent applications having publication numbers 
WO 98/53057, WO 98/53060, WO 98/53058 and WO 98/53059, which describe improved 
techniques for designing zinc finger polypeptides capable of binding desired nucleic acid 
sequences. In combination with selection procedures, such as phage display, set forth for 
example in WO 96/06166, these techniques enable the production of zinc finger 
polypeptides capable of recognising practically any desired sequence. 

In a preferred aspect, therefore, the invention provides a method for preparing a DNA 
binding polypeptide of the Cys2-His2 zinc finger class capable of binding to a target DNA 
sequence, wherein said binding is modulatable by a DNA binding ligand, and wherein 
binding to each base of the triplet by an α-helical zinc finger DNA binding motif in the 
polypeptide is determined as follows: 

a) if the 5' base in the triplet is G, then position +6 in the α-helix is Arg and/or position 
++2 is Asp; 

b) if the 5' base in the triplet is A, then position +6 in the α-helix is Gln or Glu and ++2 is 
not Asp; 

c) if the 5' base in the triplet is T, then position +6 in the α-helix is Ser or Thr and 
position ++2 is Asp; or position +6 is a hydrophobic amino acid other than Ala; 

d) if the 5' base in the triplet is C, then position +6 in the α-helix may be any amino acid, 
provided that position ++2 in the α-helix is not Asp; 

e) if the central base in the triplet is G, then position +3 in the α-helix is His; 

f) if the central base in the triplet is A, then position +3 in the α-helix is Asn; 

g) if the central base in the triplet is T, then position +3 in the α-helix is Ala, Ser, Ile, Leu, 
Thr or Val; provided that if it is Ala, then one of the residues at -1 or +6 is a small 
residue; 

h) if the central base in the triplet is 5-meC, then position +3 in the α-helix is Ala, Ser, Ile, 
Leu, Thr or Val; provided that if it is Ala, then one of the residues at -1 or +6 is a small 
residue; 

i) if the 3' base in the triplet is G, then position -1 in the α-helix is Arg; 

j) if the 3' base in the triplet is A, then position -1 in the α-helix is Gln and position +2 is 
Ala; 

k) if the 3' base in the triplet is T, then position -1 in the α-helix is Asn; or position -1 is 
Gln and position +2 is Ser; 

l) if the 3' base in the triplet is C, then position -1 in the α-helix is Asp and position +1 is 
Arg; where the central residue of a target triplet is C, the use of Asp at position +3 of a 
zinc finger polypeptide allows preferential binding to C over 5-meC. 

The foregoing represents a set of rules which permits the design of a zinc finger binding 
protein specific for any given target DNA sequence. 
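
The rules above lend themselves to a lookup table. The following Python sketch is one possible encoding, offered as an illustration only: the dictionary and function names are assumptions, 5-meC is omitted because it is not a single-letter base (it shares the +3 options of T), the hydrophobic alternative of rule c) is left out, and the accompanying constraints on positions +1, +2 and ++2 are recorded only as comments.

```python
# Position +6 is chosen from the 5' base of the triplet (rules a to d).
PLUS6_BY_5PRIME = {
    "G": ["Arg"],          # rule a: and/or Asp at ++2
    "A": ["Gln", "Glu"],   # rule b: ++2 is not Asp
    "T": ["Ser", "Thr"],   # rule c: with Asp at ++2
    "C": ["any"],          # rule d: any amino acid, ++2 is not Asp
}

# Position +3 is chosen from the central base (rules e to g, plus the note on C).
PLUS3_BY_CENTRAL = {
    "G": ["His"],
    "A": ["Asn"],
    "T": ["Ala", "Ser", "Ile", "Leu", "Thr", "Val"],
    "C": ["Asp"],          # closing note: Asp at +3 favours C over 5-meC
}

# Position -1 is chosen from the 3' base (rules i to l).
MINUS1_BY_3PRIME = {
    "G": ["Arg"],
    "A": ["Gln"],          # rule j: with Ala at +2
    "T": ["Asn", "Gln"],   # rule k: Gln requires Ser at +2
    "C": ["Asp"],          # rule l: with Arg at +1
}


def helix_options(triplet: str) -> dict[str, list[str]]:
    """Return residue options at -1, +3 and +6 for a triplet written 5' to 3'."""
    if len(triplet) != 3:
        raise ValueError("a triplet has exactly three bases")
    five, central, three = triplet.upper()
    return {
        "-1": MINUS1_BY_3PRIME[three],
        "+3": PLUS3_BY_CENTRAL[central],
        "+6": PLUS6_BY_5PRIME[five],
    }


print(helix_options("TGG"))
```

Keeping the options as ordered lists preserves the order in which the rules state them, so the first entry of each list can serve as a default choice.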

A zinc finger binding motif is a structure well known to those in the art and defined in, for 
example, Miller et al., (1985) EMBO J. 4:1609-1614; Berg (1988) PNAS (USA) 
85:99-102; Lee et al., (1989) Science 245:635-637; see International patent applications 
WO 96/06166 and WO 96/32475, corresponding to USSN 08/422,107, incorporated herein 
by reference. 

In general, a preferred zinc finger framework has the structure: 

(A) X_{0-2} C X_{1-5} C X_{9-14} H X_{3-6} H/C 

where X is any amino acid, and the numbers in subscript indicate the possible numbers of 
residues represented by X. 
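
Framework (A) translates naturally into a regular expression. The Python sketch below is an illustrative assumption rather than a validated motif scanner: it treats X as any upper-case letter, does not check for zinc co-ordination or helix content, and uses as its example the first consensus sequence quoted later in this document, written without spaces.

```python
import re

# Framework (A): X(0-2) C X(1-5) C X(9-14) H X(3-6) H/C, with X as any residue.
FRAMEWORK_A = re.compile(r"[A-Z]{0,2}C[A-Z]{1,5}C[A-Z]{9,14}H[A-Z]{3,6}[HC]")


def find_fingers(protein: str) -> list[str]:
    """Return the non-overlapping substrings that fit framework (A)."""
    return [match.group(0) for match in FRAMEWORK_A.finditer(protein.upper())]


consensus = "PYKCPECGKSFSQKSDLVKHQRTHTG"
print(find_fingers(consensus))
```

Because the expression is applied with a search rather than a full match, residues outside the framework, such as the leading Pro and the trailing linker residues of the example, are simply not included in the reported span.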

In a preferred aspect of the present invention, zinc finger nucleic acid binding motifs may 
be represented as motifs having the following primary structure: 

(B) X^a C X_{2-4} C X_{2-3} F X^c  X X X X L X X H X X X^b H - linker 
                                  -1 1 2 3 4 5 6 7 8 9 

wherein X (including X^a, X^b and X^c) is any amino acid. X_{2-4} and X_{2-3} refer to the presence 
of 2 or 4, or 2 or 3, amino acids, respectively. The Cys and His residues, which together 
co-ordinate the zinc metal atom, are marked in bold text and are usually invariant, as is the 
Leu residue at position +4 in the α-helix. 

Modifications to this representation may occur or be effected without necessarily 
abolishing zinc finger function, by insertion, mutation or deletion of amino acids. For 
example it is known that the second His residue may be replaced by Cys (Krizek et al., 
(1991) J. Am. Chem. Soc. 113:4518-4523) and that Leu at +4 can in some circumstances 
be replaced with Arg. The Phe residue before X^c may be replaced by any aromatic other 
than Trp. Moreover, experiments have shown that departure from the preferred structure 
and residue assignments for the zinc finger are tolerated and may even prove beneficial in 
binding to certain nucleic acid sequences. Even taking this into account, however, the 
general structure involving an α-helix co-ordinated by a zinc atom which contacts four Cys 
or His residues, does not alter. As used herein, structures (A) and (B) above are taken as an 
exemplary structure representing all zinc finger structures of the Cys2-His2 type. 

Preferably, X^a is F/Y-X or P-F/Y-X. In this context, X is any amino acid. Preferably, in this 
context X is E, K, T or S. Less preferred but also envisaged are Q, V, A and P. The 
remaining amino acids remain possible. 

Preferably, X_{2-4} consists of two amino acids rather than four. The first of these amino acids 
may be any amino acid, but S, E, K, T, P and R are preferred. Advantageously, it is P or R. 
The second of these amino acids is preferably E, although any amino acid may be used. 

Preferably, X^b is T or I. Preferably, X^c is S or T. 

Preferably, X_{2-3} is G-K-A, G-K-C, G-K-S or G-K-G. However, departures from the 
preferred residues are possible, for example in the form of M-R-N or M-R. 

Preferably, the linker is T-G-E-K or T-G-E-K-P. 

As set out above, the major binding interactions occur with amino acids -1, +3 and +6. 
Amino acids +4 and +7 are largely invariant. The remaining amino acids may be 
essentially any amino acids. Preferably, position +9 is occupied by Arg or Lys. 
Advantageously, positions +1, +5 and +8 are not hydrophobic amino acids, that is to say 
are not Phe, Trp or Tyr. Preferably, position ++2 is any amino acid, and preferably serine, 
save where its nature is dictated by its role as a ++2 amino acid for an N-terminal zinc 
finger in the same nucleic acid binding molecule. 

In a most preferred aspect, therefore, bringing together the above, the invention allows the 
definition of every residue in a zinc finger DNA binding motif which will bind specifically 
to a given target DNA triplet. 

The code provided by the present invention is not entirely rigid; certain choices are 
provided. For example, positions +1, +5 and +8 may have any amino acid allocation, 
whilst other positions may have certain options: for example, the present rules provide that, 
for binding to a central T residue, any one of Ala, Ser or Val may be used at +3. In its 
broadest sense, therefore, the present invention provides a very large number of proteins 
which are capable of binding to every defined target DNA triplet. 

Preferably, however, the number of possibilities may be significantly reduced. For 
example, the non-critical residues +1, +5 and +8 may be occupied by the residues Lys, Thr 
and Gln respectively as a default option. In the case of the other choices, for example, the 
first-given option may be employed as a default. Thus, the code according to the present 
invention allows the design of a single, defined polypeptide (a "default" polypeptide) 
which will bind to its target triplet. 
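
A "default" helix can be assembled mechanically once the residues at -1, +3 and +6 have been chosen. The Python sketch below fills the remaining positions with the defaults named in the text (Lys, Thr and Gln at +1, +5 and +8; Leu at +4; His at +7). Two fillers are assumptions made for illustration only: Arg at +9, taken as the first-given of the preferred "Arg or Lys", and Ser at +2, taken from the stated preference for serine at that position when it is not otherwise dictated.

```python
POSITIONS = ["-1", "+1", "+2", "+3", "+4", "+5", "+6", "+7", "+8", "+9"]

DEFAULTS = {
    "+1": "Lys",  # default for non-critical residue (from the text)
    "+2": "Ser",  # assumption: serine preferred where not otherwise dictated
    "+4": "Leu",  # largely invariant
    "+5": "Thr",  # default for non-critical residue (from the text)
    "+7": "His",  # largely invariant zinc-co-ordinating residue
    "+8": "Gln",  # default for non-critical residue (from the text)
    "+9": "Arg",  # assumption: first-given of the preferred "Arg or Lys"
}


def default_helix(minus1: str, plus3: str, plus6: str) -> str:
    """Return residues -1 to +9 as a hyphen-separated three-letter string."""
    residues = dict(DEFAULTS)
    residues.update({"-1": minus1, "+3": plus3, "+6": plus6})
    return "-".join(residues[position] for position in POSITIONS)


# Triplet GGG under the rules: -1 Arg (3' G), +3 His (central G), +6 Arg (5' G).
helix = default_helix("Arg", "His", "Arg")
print(helix)
```

Where a rule also fixes +1 or +2 (for example Ala at +2 for a 3' A), the corresponding default would need to be overridden; that refinement is left out of this sketch.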

In a further aspect of the present invention, there is provided a method for preparing a DNA 
binding protein of the Cys2-His2 zinc finger class capable of binding to a target DNA 
sequence in a manner modulatable by a DNA binding ligand, comprising the steps of: 

a) selecting a model zinc finger domain from the group consisting of naturally occurring 
zinc fingers and consensus zinc fingers; and 

b) mutating at least one of positions -1, +3, +6 (and ++2) of the finger as required by a 
method according to the present invention. 

In general, naturally occurring zinc fingers may be selected from those fingers for which 
the DNA binding specificity is known. For example, these may be the fingers for which a 
crystal structure has been resolved: namely Zif 268 (Elrod-Erickson et al., (1996) Structure 
4:1171-1180), GLI (Pavletich and Pabo, (1993) Science 261:1701-1707), Tramtrack 
(Fairall et al., (1993) Nature 366:483-487) and YY1 (Houbaviy et al., (1996) PNAS (USA) 
93:13577-13582). 



The naturally occurring zinc finger 2 in Zif 268 makes an excellent starting point from 
which to engineer a zinc finger and is preferred. 

Consensus zinc finger structures may be prepared by comparing the sequences of known 
zinc fingers, irrespective of whether their binding domain is known. Preferably, the 
consensus structure is selected from the group consisting of the consensus structure 
PYKCPECGKSFSQKSDLVKHQRTHTG, and the consensus structure 
PYKCSECGKAFSQKSNLTRHQRIHTGEKP. 

The consensuses are derived from the consensus provided by Krizek et al., (1991) J. Am. 
Chem. Soc. 113:4518-4523 and from Jacobs, (1993) PhD thesis, University of 
Cambridge, UK. In both cases, the linker sequences described above for joining two zinc 
finger motifs together, namely TGEK or TGEKP, can be formed on the ends of the 
consensus. Thus, a P may be removed where necessary, or, in the case of the consensus 
terminating TG, EK(P) can be added. 

When the nucleic acid specificity of the model finger selected is known, the mutation of the 
finger in order to modify its specificity to bind to the target DNA may be directed to 
residues known to affect binding to bases at which the natural and desired targets differ. 
Otherwise, mutation of the model fingers should be concentrated upon residues -1, +3, +6 
and ++2 as provided for in the foregoing rules. 

In order to produce a binding protein having improved binding, moreover, the rules 
provided by the present invention may be supplemented by physical or virtual modelling of 
the protein/DNA interface in order to assist in residue selection. 

In a second embodiment, the invention provides a method for producing a zinc finger 
polypeptide capable of binding to a target DNA sequence, wherein said binding is 
modulatable by a DNA binding ligand, comprising: 



a) providing a nucleic acid library encoding a repertoire of zinc finger polypeptides, the 
nucleic acid members of the library being at least partially randomised at one or more 
of the positions encoding residues -1, 2, 3 and 6 of the α-helix of the zinc finger 
polypeptides; 

b) displaying the library in a selection system and screening it against a target DNA 
sequence; 

c) isolating the nucleic acid members of the library encoding zinc finger polypeptides 
capable of binding to the target sequence in the presence/absence of DNA binding 
ligand; 

d) selecting those members of the library isolated in (c) which bind the target nucleic acid 
sequence with different affinities in the presence and absence of the DNA binding 
ligand. 
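
Step d) amounts to comparing two binding measurements per clone. The Python sketch below shows one way this comparison could be expressed; the clone names, the dissociation constants and the five-fold threshold are hypothetical values chosen for illustration, and the text does not prescribe any particular measure of affinity or cut-off.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    name: str
    kd_without_ligand_nm: float  # apparent Kd in the absence of ligand (hypothetical)
    kd_with_ligand_nm: float     # apparent Kd in the presence of ligand (hypothetical)


def ligand_dependent(clones: list[Clone], min_fold: float = 5.0) -> list[tuple[str, float]]:
    """Return (name, fold change) for clones whose affinity differs by min_fold or more."""
    selected = []
    for clone in clones:
        high = max(clone.kd_without_ligand_nm, clone.kd_with_ligand_nm)
        low = min(clone.kd_without_ligand_nm, clone.kd_with_ligand_nm)
        fold = high / low
        if fold >= min_fold:
            selected.append((clone.name, fold))
    return selected


clones = [
    Clone("clone_A", kd_without_ligand_nm=10.0, kd_with_ligand_nm=200.0),  # ligand weakens binding
    Clone("clone_B", kd_without_ligand_nm=50.0, kd_with_ligand_nm=60.0),   # essentially unaffected
    Clone("clone_C", kd_without_ligand_nm=400.0, kd_with_ligand_nm=20.0),  # ligand strengthens binding
]
selected = ligand_dependent(clones)
print(selected)
```

Taking the ratio of the larger to the smaller value captures both directions of modulation, so clones that bind only in the presence of ligand and clones that bind only in its absence are retained alike.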

Methods for the production of libraries encoding randomised polypeptides are known in the 
art and may be applied in the present invention. Randomisation may be total, or partial; in 
the case of partial randomisation, the selected codons preferably encode options for amino 
acids as set forth in the rules above. 

Zinc finger polypeptides may be designed which specifically bind to nucleic acids 
incorporating the base U, in preference to the equivalent base T. 

In a further preferred aspect, the invention comprises a method for producing a zinc finger 
polypeptide capable of binding to a target DNA sequence, wherein said binding is 
modulatable by a DNA binding ligand, comprising: 

a) providing a nucleic acid library encoding a repertoire of zinc finger polypeptides each 
possessing more than one zinc finger, the nucleic acid members of the library being at 
least partially randomised at one or more of the positions encoding residues -1, 2, 3 
and 6 of the α-helix in a first zinc finger and at one or more of the positions encoding 
residues -1, 2, 3 and 6 of the α-helix in a further zinc finger of the zinc finger 
polypeptides; 

b) displaying the library in a selection system and screening it against a target DNA; 

c) assessing the affinity of the DNA binding molecules for the target DNA in the 
presence and absence of the DNA binding ligand; and 

d) isolating the nucleic acid members of the library encoding zinc finger polypeptides 
capable of binding to the target sequence with different affinities in the presence and 
absence of DNA binding ligand. 

In this aspect, the invention encompasses library technology described in our copending 
International patent application WO 98/53057, incorporated herein by reference in its 
entirety. WO 98/53057 describes the production of zinc finger polypeptide libraries in 
which each individual zinc finger polypeptide comprises more than one, for example two 
or three, zinc fingers; and wherein within each polypeptide partial randomisation occurs in 
at least two zinc fingers. 

This allows for the selection of the "overlap" specificity, wherein, within each triplet, the 
choice of residue for binding to the third nucleotide (read 3' to 5' on the + strand) is 
influenced by the residue present at position +2 on the subsequent zinc finger, which 
displays cross-strand specificity in binding. The selection of zinc finger polypeptides 
incorporating cross-strand specificity of adjacent zinc fingers enables the selection of 
nucleic acid binding proteins more quickly, and/or with a higher degree of specificity than 
is otherwise possible. 

Zinc finger binding motifs designed according to the invention may be combined into 
nucleic acid binding polypeptide molecules having a multiplicity of zinc fingers. 
Preferably, the proteins have at least two zinc fingers. In nature, zinc finger binding 
proteins commonly have at least three zinc fingers, although two-zinc finger proteins such 
as Tramtrack are known. The presence of at least three zinc fingers is preferred. Nucleic 
acid binding proteins may be constructed by joining the required fingers end to end, 
N-terminus to C-terminus. Preferably, this is effected by joining together the relevant 
nucleic acid sequences which encode the zinc fingers to produce a composite nucleic acid 
coding sequence encoding the entire binding protein. The invention therefore provides a 
method for producing a DNA binding protein as defined above, wherein the DNA binding 
protein is constructed by recombinant DNA technology, the method comprising the steps 
of: 

a) preparing a nucleic acid coding sequence encoding two or more zinc finger binding 
motifs as defined above, placed N-terminus to C-terminus; 

b) inserting the nucleic acid sequence into a suitable expression vector; and 

c) expressing the nucleic acid sequence in a host organism in order to obtain the DNA 
binding protein. 

A "leader" peptide may be added to the N-terminal finger. Preferably, the leader peptide is 
MAEEKP. 
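
The end-to-end joining of fingers can be sketched at the amino acid level. The Python example below joins three copies of the consensus finger that terminates in TG, adding EK between fingers so that, together with the leading P of the following finger, the linker TGEKP is formed, as the text on the consensus sequences describes. Dropping the duplicated P where the leader MAEEKP meets the first finger is an assumption made here for illustration; the function name is likewise hypothetical.

```python
LEADER = "MAEEKP"
CONSENSUS_TG = "PYKCPECGKSFSQKSDLVKHQRTHTG"  # consensus terminating in TG


def assemble(fingers: list[str], leader: str = LEADER) -> str:
    """Join fingers N-terminus to C-terminus, forming TGEKP linkers between them."""
    if not fingers:
        raise ValueError("at least one finger is required")
    for finger in fingers:
        if not (finger.startswith("P") and finger.endswith("TG")):
            raise ValueError("this sketch expects fingers of the form P...TG")
    # Adding EK after each non-terminal finger gives ...TG + EK + P..., i.e. TGEKP.
    body = "EK".join(fingers)
    if leader.endswith("P"):
        body = body[1:]  # assumption: remove the duplicated P after the leader
    return leader + body


protein = assemble([CONSENSUS_TG] * 3)
print(protein)
print(len(protein), protein.count("TGEKP"))
```

In practice the composite would be specified as a nucleic acid coding sequence, as the method steps above state; the amino acid view is used here only because it makes the linker junctions easy to inspect.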






B. Nucleic acid vectors encoding DNA binding proteins 



A nucleic acid encoding the DNA binding protein according to the invention can be 
incorporated into vectors for further manipulation. As used herein, vector (or plasmid) 
refers to discrete elements that are used to introduce heterologous nucleic acid into cells for 
either expression or replication thereof. Selection and use of such vehicles are well within 
the skill of the person of ordinary skill in the art. Many vectors are available, and selection 
of appropriate vector will depend on the intended use of the vector, i.e. whether it is to be 
used for DNA amplification or for nucleic acid expression, the size of the DNA to be 
inserted into the vector, and the host cell to be transformed with the vector. Each vector 
contains various components depending on its function (amplification of DNA or 
expression of DNA) and the host cell for which it is compatible. The vector components 
generally include, but are not limited to, one or more of the following: an origin of 
replication, one or more marker genes, an enhancer element, a promoter, a transcription 
termination sequence and a signal sequence. 

Both expression and cloning vectors generally contain a nucleic acid sequence that enables 
the vector to replicate in one or more selected host cells. Typically in cloning vectors, this 
sequence is one that enables the vector to replicate independently of the host chromosomal 
DNA, and includes origins of replication or autonomously replicating sequences. Such 
sequences are well known for a variety of bacteria, yeast and viruses. The origin of 
replication from the plasmid pBR322 is suitable for most Gram-negative bacteria, the 2μ 
plasmid origin is suitable for yeast, and various viral origins (e.g. SV40, polyoma, 
adenovirus) are useful for cloning vectors in mammalian cells. Generally, the origin of 
replication component is not needed for mammalian expression vectors unless these are 
used in mammalian cells competent for high level DNA replication, such as COS cells. 

Most expression vectors are shuttle vectors, i.e. they are capable of replication in at least 
one class of organisms but can be transfected into another class of organisms for 
expression. For example, a vector is cloned in E. coli and then the same vector is 
transfected into yeast, mammalian or plant cells even though it is not capable of replicating 
independently of the host cell chromosome. DNA may also be replicated by insertion into 
the host genome. However, the recovery of genomic DNA encoding the DNA binding 
protein is more complex than that of episomally replicated vector because restriction 
enzyme digestion is required to excise DNA binding protein DNA. DNA can be amplified 
by PCR and be directly transfected into the host cells without any replication component. 

Advantageously, an expression and cloning vector may contain a selection gene also 
referred to as selectable marker. This gene encodes a protein necessary for the survival or 
growth of transformed host cells grown in a selective culture medium. Host cells not 
transformed with the vector containing the selection gene will not survive in the culture 
medium. Typical selection genes encode proteins that confer resistance to antibiotics and 
other toxins, e.g. ampicillin, neomycin, methotrexate or tetracycline, complement 
auxotrophic deficiencies, or supply critical nutrients not available from complex media. 

Selectable markers which may be used in fungal cells, for example yeast cells, include 
wild-type genes which complement auxotrophic defects in for example the Uracil (eg. 
URA3 gene), Lysine (eg. LYS2 gene), Adenine (eg. ADE2 gene), Methionine (eg. MET3 
gene), Histidine (eg. HIS3 gene), Tryptophan (eg. TRP1 gene), Leucine (eg. LEU2 gene) or 
other metabolic pathways. In addition, counter-selection methods are well known in the 
art. These enable genes to be selected against by the action of a chemical precursor which 
is harmless unless converted to a toxic product by the action of one or more gene(s). 
Examples of these include: 5-fluoro-orotic acid, which is converted to a toxic compound by 
the action of the URA3 gene product; α-amino-adipic acid, which is converted to a toxic 
compound by the LYS2 gene product; allyl alcohol, which is converted to a toxic 
compound by alcohol dehydrogenase activity as encoded by the ADH genes, or any other 
suitable selective regime known to those skilled in the art. Other selective markers are 
based on the expression of a gene in a fungus such as yeast which overcomes the metabolic 
arrest induced by, or toxicity of, a chemical entity which may be added to the growth 
medium or otherwise presented to the cells. Examples of these may include the KAN 
gene(s) which confer resistance to antibiotics such as G418, the HIS3 gene which confers 
resistance to 3-amino-triazole, or the ADH2 gene which can confer resistance to heavy 
metal ions such as cadmium, or any other suitable genes which confer resistance to toxic or 
growth arresting regimes. 

Since the replication of vectors is conveniently done in E. coli, an E. coli genetic marker 
and an E. coli origin of replication are advantageously included. These can be obtained 
from E. coli plasmids, such as pBR322, Bluescript© vector or a pUC plasmid, e.g. pUC18 
or pUC19, which contain both E. coli replication origin and E. coli genetic marker 
conferring resistance to antibiotics, such as ampicillin. 

Suitable selectable markers for mammalian cells are those that enable the identification of 
cells competent to take up DNA binding protein nucleic acid, such as dihydrofolate 
reductase (DHFR, methotrexate resistance), thymidine kinase, or genes conferring 
resistance to G418 or hygromycin. The mammalian cell transformants are placed under 
selection pressure which only those transformants which have taken up and are expressing 
the marker are uniquely adapted to survive. In the case of a DHFR or glutamine synthase 
(GS) marker, selection pressure can be imposed by culturing the transformants under 
conditions in which the pressure is progressively increased, thereby leading to 
amplification (at its chromosomal integration site) of both the selection gene and the linked 
DNA that encodes the DNA binding protein. Amplification is the process by which genes 
in greater demand for the production of a protein critical for growth, together with closely 
associated genes which may encode a desired protein, are reiterated in tandem within the 
chromosomes of recombinant cells. Increased quantities of desired protein are usually 
synthesised from thus amplified DNA. 

Expression and cloning vectors usually contain a promoter that is recognised by the host 
organism and is operably linked to nucleic acid encoding DNA binding protein. Such a 
promoter may be inducible or constitutive. The promoters are operably linked to DNA 



encoding the DNA binding protein by removing the promoter from the source DNA by 
restriction enzyme digestion and inserting the isolated promoter sequence into the vector. 
Both the native DNA binding protein promoter sequence and many heterologous promoters 
may be used to direct amplification and/or expression of DNA binding protein encoding 
DNA. 

Promoters suitable for use with prokaryotic hosts include, for example, the β-lactamase and 
lactose promoter systems, alkaline phosphatase, the tryptophan (trp) promoter system and 
hybrid promoters such as the tac promoter. Their nucleotide sequences have been 
published, thereby enabling the skilled worker operably to ligate them to DNA encoding 
DNA binding protein, using linkers or adapters to supply any required restriction sites. 
Promoters for use in bacterial systems will also generally contain a Shine-Dalgarno 
sequence operably linked to the DNA encoding the DNA binding protein. 

Preferred expression vectors are bacterial expression vectors which comprise a promoter of 
a bacteriophage such as phagex or T7 which is capable of functioning in the bacteria. In 
one of the most widely used expression systems, the nucleic acid encoding the fusion 
protein may be transcribed from the vector by T7 RNA polymerase (Studier et al, Methods 
in Enzymol. 185; 60-89, 1990). In the E. coli BL21(DE3) host strain, used in conjunction 
with pET vectors, the T7 RNA polymerase is produced from the λ-lysogen DE3 in the host 
bacterium, and its expression is under the control of the IPTG inducible lac UV5 promoter. 
This system has been employed successfully for over-production of many proteins. 
Alternatively the polymerase gene may be introduced on a lambda phage by infection with 
an int- phage such as the CE6 phage which is commercially available (Novagen, Madison, 
USA). Other vectors include vectors containing the lambda PL promoter such as PLEX 
(Invitrogen, NL), vectors containing the trc promoters such as pTrcHisXpressTm 
(Invitrogen) or pTrc99 (Pharmacia Biotech, SE) or vectors containing the tac promoter 
such as pKK223-3 (Pharmacia Biotech) or PMAL (New England Biolabs, MA, USA). 

Moreover, the DNA binding protein gene according to the invention preferably includes a 
secretion sequence in order to facilitate secretion of the polypeptide from bacterial hosts, 
such that it will be produced as a soluble native peptide rather than in an inclusion body. 
The peptide may be recovered from the bacterial periplasmic space, or the culture medium, 
as appropriate. 

Suitable promoting sequences for use with yeast hosts may be regulated or constitutive and 
are preferably derived from a highly expressed yeast gene, especially a Saccharomyces 
cerevisiae gene. Thus, the promoter of the TRP1 gene, the ADHI or ADHII gene, the acid 
phosphatase (PHO5) gene, a promoter of the yeast mating pheromone genes coding for the 
a- or α-factor or a promoter derived from a gene encoding a glycolytic enzyme such as the 
promoter of the enolase, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), 
3-phosphoglycerate kinase (PGK), hexokinase, pyruvate decarboxylase, phosphofructokinase, 
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triose 
phosphate isomerase, phosphoglucose isomerase or glucokinase genes, or a promoter from 
the TATA binding protein (TBP) gene can be used. Furthermore, it is possible to use 
hybrid promoters comprising upstream activation sequences (UAS) of one yeast gene and 
downstream promoter elements including a functional TATA box of another yeast gene, 
for example a hybrid promoter including the UAS(s) of the yeast PHO5 gene and 
downstream promoter elements including a functional TATA box of the yeast GAP gene 
(PHO5-GAP hybrid promoter). A suitable constitutive PHO5 promoter is e.g. a shortened 
acid phosphatase PHO5 promoter devoid of the upstream regulatory elements (UAS) such 
as the PHO5 (-173) promoter element starting at nucleotide -173 and ending at nucleotide 
-9 of the PHO5 gene. 

DNA binding protein gene transcription from vectors in mammalian hosts may be 
controlled by promoters derived from the genomes of viruses such as polyoma virus, 
adenovirus, fowlpox virus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus 
(CMV), a retrovirus and Simian Virus 40 (SV40), from heterologous mammalian 
promoters such as the actin promoter or a very strong promoter, e.g. a ribosomal protein 
promoter, and from the promoter normally associated with DNA binding protein sequence, 
provided such promoters are compatible with the host cell systems. 






Transcription of a DNA encoding DNA binding protein by higher eukaryotes may be 
increased by inserting an enhancer sequence into the vector. Enhancers are relatively 
orientation and position independent. Many enhancer sequences are known from 
mammalian genes (e.g. elastase and globin). However, typically one will employ an 
enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late 
side of the replication origin (bp 100-270) and the CMV early promoter enhancer. The 
enhancer may be spliced into the vector at a position 5' or 3' to DNA binding protein DNA, 
but is preferably located at a site 5' from the promoter. 

Advantageously, a eukaryotic expression vector encoding a DNA binding protein 
according to the invention may comprise a locus control region (LCR). LCRs are capable 
of directing high-level integration site independent expression of transgenes integrated into 
host cell chromatin, which is of importance especially where the DNA binding protein gene 
is to be expressed in the context of a permanently-transfected eukaryotic cell line in which 
chromosomal integration of the vector has occurred, or in transgenic animals. 

Eukaryotic vectors may also contain sequences necessary for the termination of 
transcription and for stabilising the mRNA. Such sequences are commonly available from 
the 5' and 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions 
contain nucleotide segments transcribed as polyadenylated fragments in the untranslated 
portion of the mRNA encoding DNA binding protein. 

An expression vector includes any vector capable of expressing DNA binding protein 
nucleic acids that are operatively linked with regulatory sequences, such as promoter 
regions, that are capable of expression of such DNAs. Thus, an expression vector refers to 
a recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or 
other vector, that upon introduction into an appropriate host cell, results in expression of 
the cloned DNA. Appropriate expression vectors are well known to those with ordinary 
skill in the art and include those that are replicable in eukaryotic and/or prokaryotic cells 
and those that remain episomal or those which integrate into the host cell genome. For 
example, DNAs encoding DNA binding protein may be inserted into a vector suitable for 
expression of cDNAs in mammalian cells, e.g. a CMV enhancer-based vector such as 
pEVRF (Matthias, et al, (1989) NAR 17, 6418). 

In a preferred embodiment, the DNA binding protein constructs of the invention are 
expressed in plant cells under the control of transcriptional regulatory sequences that are 
known to function in plants. The regulatory sequences selected will depend on the required 
temporal and spatial expression pattern of the DNA binding protein in the host plant. 
Many plant promoters have been characterized and would be suitable for use in 
conjunction with the invention. By way of illustration, some examples are provided below: 


A large number of promoters are known in the art which direct expression in specific 
tissues and organs (e.g. roots, leaves, flowers) or in cell types (e.g. leaf epidermal cells, leaf 
mesophyll cells, root cortex cells). For example, the maize PEPC promoter from the 
phosphoenolpyruvate carboxylase gene (Hudspeth & Grula Plant Mol. Biol. 12: 579-589 (1989)) is 
green tissue-specific; the trpA gene promoter is pith cell-specific (WO 93/07278 to 
Ciba-Geigy); the TA29 promoter is pollen-specific (Mariani et al. Nature 347: 737-741 (1990); 
Mariani et al. Nature 357: 384-387 (1992)). 

Other promoters direct transcription under conditions of presence of light or absence of 
light or in a circadian manner. For example, the GS2 promoter described by Edwards and 
Coruzzi, Plant Cell 1: 241-248 (1989) is induced by light, whereas the AS1 promoter 
described by Tsai and Coruzzi, EMBO J 9: 323-332 (1990) is expressed only in conditions 
of darkness. 

Other promoters are wound-inducible and typically direct transcription not just on wound 
induction, but also at the sites of pathogen infection. Examples are described by Xu et al. 
(Plant Mol. Biol. 22: 573-588 (1993)); Logemann et al. (Plant Cell 1: 151-158 (1989)); and 
Firek et al. (Plant Mol Biol 22: 129-142 (1993)). 

A number of constitutive promoters can be used in plants. These include the Cauliflower 
Mosaic Virus 35S promoter (US 5,352,605 and US 5,322,938, both to Monsanto) including 
minimal promoters (such as the -90 or -46 CaMV 35S promoter) linked to other regulatory 
sequences, the rice actin promoter (McElroy et al. Mol. Gen. Genet. 231: 150-160 (1991)), 
and the maize and sunflower ubiquitin promoters (Christensen et al. Plant Mol Biol. 12: 
619-632 (1989); Binet et al. Plant Science 79: 87-94 (1991)). 

Using promoters that direct transcription in the plant species of interest, the DNA binding 
protein of the invention can be expressed in the required cell or tissue types. For example, 
if it is the intention to utilize the DNA binding protein to regulate a gene in a specific cell 
or tissue type, then the appropriate promoter can be used to direct expression of the DNA 
binding protein construct. 

An appropriate terminator of transcription is fused downstream of the selected DNA 
binding protein containing transgene and any of a number of available terminators can be 
used in conjunction with the invention. Examples of transcriptional terminator sequences 
that are known to function in plants include the nopaline synthase terminator found in the 
pBI vectors (Clontech catalog 1993/1994), the E9 terminator from the rbcS gene (ref), and 
the tml terminator from Cauliflower Mosaic Virus. 

A number of sequences found within the transcriptional unit are known to enhance gene 
expression and these can be used within the context of the current invention. Such 
sequences include intron sequences which, particularly in monocotyledonous cells, are 
known to enhance expression. Both intron 1 of the maize Adh1 gene and the intron from 
the maize bronze 1 gene have been found to be effective in enhancing expression in maize 
cells (Callis et al. Genes Develop. 1: 1183-1200 (1987)) and intron sequences are 
frequently incorporated into plant transformation vectors, typically within the 
non-translated leader. 

A number of virus-derived non-translated leader sequences have been found to enhance 
expression, especially in dicotyledonous cells. Examples include the "Ω" leader sequence 
of Tobacco Mosaic Virus, and similar leader sequences of Maize Chlorotic Mottle Virus 
and Alfalfa Mosaic Virus (Gallie et al. Nucl. Acids Res. 15: 8693-8711 (1987); Skuzeski et 
al. Plant Mol. Biol. 15: 65-79 (1990)). 

The DNA binding proteins of the current invention are targeted to the cell nucleus so that 
they are able to interact with host cell DNA and bind to the appropriate DNA target in the 
nucleus and regulate transcription. To effect this, a Nuclear Localization Sequence (NLS) 
is incorporated in frame with the expressible zinc finger construct. The NLS can be fused 
either 5' or 3' to the zinc finger encoding sequence. 

The NLS of the wild-type Simian Virus 40 Large T-Antigen (Kalderon et al. Cell 37: 801- 
813 (1984); Markland et al. Mol. Cell Biol. 7: 4255-4265 (1987)) is an appropriate NLS 
and has previously been shown to provide an effective nuclear localization mechanism in 
plants (van der Krol et al. Plant Cell 3: 667-675 (1991)). However, several alternative 
NLSs are known in the art and can be used instead of the SV40 NLS sequence. These 
include the Nuclear Localization Signals of TGA-1A and TGA-1B (van der Krol et al. 
Plant Cell 3: 667-675 (1991)). 

A variety of transformation vectors are available for plant transformation and the DNA 
binding protein encoding genes of the invention can be used in conjunction with any such 
vectors. The selection of vector will depend on the preferred transformation technique and 
the plant species which is to be transformed. For certain target species, different selectable 
markers may be preferred. 

For Agrobacterium-mediated transformation, binary vectors or vectors carrying at least one 
T-DNA border sequence are suitable. A number of vectors are available including pBIN19 
(Bevan, Nucl. Acids Res. 12: 8711-8721 (1984)), the pBI series of vectors, and pCIB10 and 
derivatives thereof (Rothstein et al. Gene 53: 153-161 (1987); WO 95/33818 to Ciba- 
Geigy). 

Binary vector constructs prepared for Agrobacterium transformation are introduced into an 
appropriate strain of Agrobacterium tumefaciens (for example, LBA 4404 or GV 3101) 
either by triparental mating (Bevan, Nucl. Acids Res. 12: 8711-8721 (1984)) or direct 
transformation (Hofgen & Willmitzer, Nucl. Acids Res. 16: 9877 (1988)). 

For transformation which is not Agrobacterium-mediated (i.e. direct gene transfer), any 
vector is suitable and linear DNA containing only the construct of interest may be 
preferred. Direct gene transfer can be undertaken using a single DNA species or multiple 
DNA species (co-transformation; Schroder et al. Biotechnology 4: 1093-1096 (1986)). 

Particularly useful for practising several embodiments of the present invention are 
expression vectors that provide for the transient expression of DNA encoding a DNA 
binding protein in plant cells or mammalian cells. Transient expression usually involves 



the use of an expression vector that is able to replicate efficiently in a host cell, such that 
the host cell accumulates many copies of the expression vector, and, in turn, synthesises 
high levels of DNA binding protein. For the purposes of the present invention, transient 
expression systems are useful e.g. for identifying DNA binding protein mutants, to identify 
potential phosphorylation sites, or to characterise functional domains of the protein. 

Construction of vectors according to the invention employs conventional ligation 
techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in the 
form desired to generate the plasmids required. If desired, analysis to confirm correct 
sequences in the constructed plasmids is performed in a known fashion. Suitable methods 
for constructing expression vectors, preparing in vitro transcripts, introducing DNA into 
host cells, and performing analyses for assessing DNA binding protein expression and 
function are known to those skilled in the art. Gene presence, amplification and/or 
expression may be measured in a sample directly, for example, by conventional Southern 
blotting, Northern blotting to quantitate the transcription of mRNA, dot blotting (DNA or 
RNA analysis), or in situ hybridisation, using an appropriately labelled probe which may be 
based on a sequence provided herein. Those skilled in the art will readily envisage how 
these methods may be modified, if desired. 

In accordance with another embodiment of the present invention, there are provided cells 
containing the above-described nucleic acids. Such host cells such as prokaryote, yeast and 
higher eukaryote cells may be used for replicating DNA and producing the DNA binding 
protein. Suitable prokaryotes include eubacteria, such as Gram-negative or Gram-positive 
organisms, such as E. coli, e.g. E. coli K-12 strains, DH5α and HB101, or Bacilli. Further 
hosts suitable for the DNA binding protein encoding vectors include eukaryotic microbes 
such as filamentous fungi or yeast, e.g. Saccharomyces cerevisiae. Higher eukaryotic cells 
include plant cells and animal cells such as insect and vertebrate cells, particularly 
mammalian cells including human cells, or nucleated cells from other multicellular 
organisms. In recent years propagation of vertebrate cells in culture (tissue culture) has 
become a routine procedure. Examples of useful mammalian host cell lines are epithelial 
or fibroblastic cell lines such as Chinese hamster ovary (CHO) cells, NIH 3T3 cells, HeLa 
cells or 293T cells. The host cells referred to in this disclosure comprise cells in in vitro 
culture as well as cells that are within a multicellular host organism. 

DNA may be stably incorporated into cells or may be transiently expressed using methods 
known in the art. Stably transfected cells may be prepared by transfecting cells with an 
expression vector having a selectable marker gene, and growing the transfected cells under 
conditions selective for cells expressing the marker gene. To prepare transient 
transfectants, cells are transfected with a reporter gene to monitor transfection efficiency. 

To produce such stably or transiently transfected cells, the cells should be transfected with 
a sufficient amount of the DNA binding protein-encoding nucleic acid to form the DNA 
binding protein. The precise amounts of DNA encoding the DNA binding protein may be 
empirically determined and optimised for a particular cell and assay. 



Host cells are transfected or, preferably, transformed with the above-mentioned expression 
or cloning vectors of this invention and cultured in conventional nutrient media modified 
as appropriate for inducing promoters, selecting transformants, or amplifying the genes 
encoding the desired sequences. Heterologous DNA may be introduced into host cells by 
any method known in the art, such as transfection with a vector encoding a heterologous 
DNA by the calcium phosphate coprecipitation technique or by electroporation. Numerous 
methods of transfection are known to the skilled worker in the field. Successful 
transfection is generally recognised when any indication of the operation of this vector 
occurs in the host cell. Transformation is achieved using standard techniques appropriate 
to the particular host cells used. 

Incorporation of cloned DNA into a suitable expression vector, transfection of eukaryotic 
cells with a plasmid vector or a combination of plasmid vectors, each encoding one or 
more distinct genes or with linear DNA, and selection of transfected cells are well known 
in the art (see, e.g. Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 
Second Edition, Cold Spring Harbor Laboratory Press). 

Transfected or transformed cells are cultured using media and culturing methods known in 
the art, preferably under conditions whereby the DNA binding protein encoded by the DNA 
is expressed. The composition of suitable media is known to those in the art, so that they 
can be readily prepared. Suitable culturing media are also commercially available. 

Transformation of plant cells is normally undertaken with a selectable marker which may 
provide resistance to an antibiotic or to a herbicide. Selectable markers that are routinely 
used in transformation include the nptII gene which confers resistance to kanamycin 
(Messing & Vieira Gene 19: 259-268 (1982); Bevan et al. Nature 304: 184-187 (1983)), 
the bar gene which confers resistance to the herbicide phosphinothricin (White et al. Nucl. 
Acids Res. 18: 1062 (1990); Spencer et al. Theor. Appl. Genet. 79: 625-631 (1990)), the 
hph gene which confers resistance to the antibiotic hygromycin (Blochlinger & 
Diggelmann Mol. Cell Biol. 4: 2929-2931 (1984)), and the dhfr gene which confers 
resistance to methotrexate (Bourouis et al. EMBO J 2: 1099-1104 (1983)). More recently, 
a number of selection systems have been developed which do not rely on selection for 
resistance to antibiotic or herbicide. These include the inducible isopentenyl transferase 
system described by Kunkel et al. (Nature Biotechnology 17: 916-919 (1999)). 

Although specific protocols may vary from species to species, transformation techniques 
are well known in the art for most commercial plant species. 



In the case of dicotyledonous species, Agrobacterium-mediated transformation is generally 
a preferred technique as it has broad application to many dicotyledonous species and is 
generally very efficient. Agrobacterium-mediated transformation generally involves the 
co-cultivation of Agrobacterium with explants from the plant and follows procedures and 
protocols that are known in the art. Transformed tissue is generally regenerated on medium 
carrying the appropriate selectable marker. Protocols are known in the art for many 
dicotyledonous crops including (for example) cotton, tomato, canola and oilseed rape, 
poplar, potato, sunflower, tobacco and soybean (see for example EP 0 317 511, EP 0 249 
432, WO 87/07299, US 5,795,855). 

In addition to Agrobacterium-mediated transformation, various other techniques can be 
applied to dicotyledons. These include PEG and electroporation-mediated transformation 
of protoplasts, and microinjection (see for example Potrykus et al. Mol. Gen. Genet. 199: 
169-177 (1985); Reich et al. Biotechnology 4: 1001-1004 (1986); Klein et al. Nature 327: 
70-73 (1987)). As with Agrobacterium-mediated transformation, transformed tissue is 
generally regenerated on medium carrying the appropriate selectable marker using standard 
techniques known in the art. 

Although Agrobacterium-mediated transformation has been applied successfully to 
monocotyledonous species such as rice and maize and protocols for these approaches are 
available in the art, the most widely used transformation techniques for monocotyledons 
remain particle bombardment, and PEG and electroporation-mediated transformation of 
protoplasts. 

In the case of maize, Gordon-Kamm et al. (Plant Cell 2: 603-618 (1990)), Fromm et al. 
(Biotechnology 8: 833-839 (1990)) and Koziel et al. (Biotechnology 11: 194-200 (1993)) 
have published techniques for transformation using particle bombardment. 

In the case of rice, protoplast-mediated transformation for both Japonica- and Indica-types 
has been described (Zhang et al. Plant Cell Rep. 7: 379-384 (1988); Shimamoto et al. 
Nature 338: 274-277; Datta et al. Biotechnology 8: 736-740 (1990)) and both types are also 
routinely transformable using particle bombardment (Christou et al. Biotechnology 9: 957- 
962 (1991)). 

In the case of wheat, transformation by particle bombardment has been described for both 
type C long-term regenerable callus (Vasil et al. Biotechnology 10: 667-674 (1992)) and 
immature embryos and immature embryo-derived callus (Vasil et al. Biotechnology 11: 
1553-1558 (1993); Weeks et al. Plant Physiol. 102: 1077-1084 (1993)). A further 
technique is described in published patent applications WO 94/13822 and WO 95/33818. 

The DNA binding protein constructs of the invention are suitable for expression in a 
variety of different organisms. However, to enhance the efficiency of expression it may be 
necessary to modify the nucleotide sequence encoding the DNA binding protein to account 
for different frequencies of codon usage in different host organisms. Hence it is preferable 
that the sequences to be introduced into organisms, such as plants, conform to preferred 
usage of codons in the host organism. 



In general, high expression in plants is best achieved from codon sequences that have a GC 
content of at least 35% and preferably more than 45%. This is thought to be because 
ATTTA motifs, which are more frequent in AT-rich sequences, destabilize messenger RNAs 
and AATAAA motifs may cause inappropriate polyadenylation, resulting in truncation of 
transcription. Murray et al. (Nucl. Acids Res. 17: 477-498 (1989)) have shown that even 
within plants, monocotyledonous and dicotyledonous species have differing preferences for 
codon usage, with monocotyledonous species generally preferring GC richer sequences. 
Thus, in order to achieve optimal high level expression in plants, gene sequences can be 
altered to accommodate such preferences in codon usage in such a manner that the amino 
acids encoded by the DNA are not changed. 
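
By way of illustration only, the two sequence checks discussed above (overall GC content, and the presence of ATTTA or AATAAA motifs) can be sketched in a few lines of Python. The function names and the default 35% threshold parameter are assumptions chosen for this sketch; they are not part of the invention and any equivalent tool may be used.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA coding sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motifs(seq: str, motifs=("ATTTA", "AATAAA")) -> dict:
    """Return {motif: [0-based start positions]}, including overlapping hits."""
    seq = seq.upper()
    hits = {}
    for motif in motifs:
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[motif] = positions
    return hits


def flag_for_recoding(seq: str, min_gc: float = 0.35) -> bool:
    """Hypothetical helper: True if the sequence falls below the GC
    threshold or contains any of the motifs named in the text."""
    has_motif = any(find_motifs(seq).values())
    return gc_content(seq) < min_gc or has_motif
```

A sequence flagged in this way would be a candidate for synonymous recoding before transformation; the sketch only reports the problem and does not perform the recoding.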

Plants also have a preference for certain nucleotides adjacent to the ATG encoding the 
initiating methionine and for most efficient translation, these nucleotides may be modified. 
To facilitate translation in plant cells, it is preferable to insert, immediately upstream of the 
ATG representing the initiating methionine of the gene to be expressed, a "plant 
translational initiation context sequence". A variety of sequences can be inserted at this 
position. These include the sequence 5'-AAGGAGATATAACAATG-3' 
(Prasher et al. Gene 111: 229-233 (1992); Chalfie et al. Science 263: 802-805 (1992)), the 
sequence 5'-GTCGACCATG-3' (Clontech 1993/1994 catalog, page 210), and the 
sequence 5'-TAAACAATG-3' (Joshi et al. Nucl. Acids Res. 15: 6643-6653 (1987)). For 
any particular plant species, a survey of natural sequences available in any databank (e.g. 
GenBank) can be undertaken to determine preferred "plant translational initiation context 
sequences" on a species-by-species basis. 
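
A survey of this kind can be sketched as a simple per-position tally. The Python below assumes that the bases immediately upstream of each initiating ATG have already been extracted from databank entries and trimmed to equal length; the function name and the input format are hypothetical, and retrieval from a databank is not shown.

```python
from collections import Counter


def initiation_context_consensus(contexts):
    """Return the most frequent base at each position upstream of the ATG.

    `contexts` is a list of equal-length strings, each holding the bases
    immediately 5' of an initiating ATG in one surveyed gene.
    """
    if not contexts:
        raise ValueError("no sequences supplied")
    length = len(contexts[0])
    if any(len(c) != length for c in contexts):
        raise ValueError("all context sequences must have the same length")
    consensus = []
    for i in range(length):
        counts = Counter(c[i].upper() for c in contexts)
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)
```

For example, calling the function on the made-up contexts `["TAAACA", "TAAACA", "GAAACA", "TCAACC"]` yields `"TAAACA"`. In practice a much larger sample would be surveyed, and positions with no clear majority would need separate treatment.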

Any changes that are made to the coding sequence can be made using techniques that are 
well known in the art and include site directed mutagenesis, PCR, and synthetic gene 
construction. Such methods are described in published patent applications EP 0 385 962 
(to Monsanto), EP 0 359 472 (to Lubrizol) and WO 93/07278 (to Ciba-Geigy). Well 
known protocols for transient expression in plants can be used to check the expression of 
modified genes before their transfer to plants by transformation. 

C. DNA binding ligands 

A DNA binding ligand according to the invention is typically any molecule capable of 
binding DNA. A variety of DNA binding ligands are known in the art and include acridine 
orange, 9-amino-6-chloro-2-methoxyacridine, actinomycin D, 7-aminoactinomycin D, 
echinomycin, dihydroethidium, ethidium-acridine heterodimer, 
propidium iodide, hexidium iodide, Hoechst 33258, Hoechst 33342, hydroxystilbamidine, 
psoralen, distamycin A, calicheamicin oligosaccharides, [illegible], 
PNA and pyrrole-imidazole polyamides. Also included within the meaning of the term 
DNA binding ligand are molecules capable of binding RNA and/or other nucleic acids. 

binding DNA, RNA and/or other nucleic acids. 

In a preferred embodiment, a DNA binding ligand according to the invention is capable of 
modulating the topology, locally or otherwise, of the nucleic acid to which it is bound. In 
particular, a DNA binding ligand according to the invention may be capable of modulating 
the topology of a juxtaposed nucleic acid sequence motif to which it is desired to bind a 
DNA binding molecule according to the invention. 

Preferred DNA binding ligands have shape and charge characteristics that allow them to 
reside along the DNA, in either the minor or major groove, intercalate or a combination of 
these. 



Suitable DNA binding ligands in addition to those known in the art may be selected by the 
use of nucleic acid binding assays. For example, a candidate DNA binding ligand, 
preferably a plurality of candidate DNA binding ligands, is contacted with nucleic acid and 
binding determined. The nucleic acids may for example be labelled with a detectable label, 
such as a fluorophore/fluorochrome, such that after a wash step binding can be determined 
easily, for example by monitoring fluorescence. The nucleic acid with which the candidate 
binding ligands are contacted may be non-specific nucleic acids, such as a random 
oligonucleotide library or sonicated genomic DNA and the like. Alternatively, a specific 
sequence may be used or a partially randomised library of sequences. 

It is particularly preferred that DNA binding ligands of the invention bind to DNA in a 
sequence and/or topology dependent manner so that binding can be restricted to a particular 
target DNA thus enhancing the specificity of the gene switch. Specificity of binding may 
be determined, for example, by comparing the binding of the DNA binding ligand to a 
target sequence with binding to a mixture of non-specific DNA molecules. 
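
One simple way to express such a comparison numerically is as a ratio of background-corrected signals. The sketch below is an assumption made for illustration: the text does not prescribe any particular metric, and the function name, the background correction and the example readings are all hypothetical.

```python
def specificity_ratio(target_signal, nonspecific_signal, background=0.0):
    """Ratio of ligand binding to target DNA over binding to a mixture of
    non-specific DNA, after subtracting a common background reading."""
    target = target_signal - background
    nonspecific = nonspecific_signal - background
    if nonspecific <= 0:
        raise ValueError("non-specific signal must exceed background")
    return target / nonspecific
```

With made-up fluorescence readings of 1100 units for the target, 150 units for the non-specific mixture and a background of 100 units, the ratio is 20, i.e. the ligand binds the target about twenty-fold better under those assay conditions.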

DNA binding ligands according to the invention may bind conditionally to nucleic acid. 
For example, psoralen is a ligand that can bind DNA covalently if illuminated at 
wavelengths of about 400 nm or less. Ligands capable of binding nucleic acids in more 
than one manner may be employed in the current invention. Such ligands may bind or 
associate with the DNA via any one or more mechanism(s) as outlined above. 

In a preferred embodiment, libraries of DNA binding ligands may be prepared. In 
particular, libraries of DNA binding ligands may be immobilised to a solid phase, such as a 
substantially planar solid phase, including membranes and non-porous substrates such as 
plastic and glass. The resulting immobilised library may conveniently be used in high 
throughput screening procedures. 

Particularly preferred DNA binding ligands are those which are substantially non-toxic to 
plant and/or animal cells such that they may be administered to said cells and modulate 
binding of the DNA binding molecule without having an adverse effect on the cells. Thus 
it may be desirable to pre-screen compounds to exclude toxic compounds. 

Furthermore, given that DNA binding ligands should typically be capable of being taken up 
by the cells of animals or plants, preferred compounds are suitable for administration to 
animals and plants. For example, preferred compounds are capable of being taken up via 
the leaves (for foliar application) or roots of plants (for application to the soil) or of 
permeating seeds (for use in seed treatment). It may also be preferred to use compounds that 
can be taken up by bacteria, yeast and/or fungi that can themselves be delivered to the 
target host organism. The compounds should also preferably be stable in the soil and/or 
plant for prolonged periods. In the case of animals, preferred compounds are suitable for 
topical or oral administration. 

D. Target DNA 



The term 'target DNA' refers to any DNA for use in the methods of the invention. This 
DNA may be of known sequence, or may be of unknown sequence. This DNA may be 
prepared artificially in a laboratory, or may be a naturally occurring DNA. This DNA may 
be in substantially pure form, or may be in a partially purified form, or may be part of an 
unpurified or heterogeneous sample. Preferably, the target DNA is a putative promoter or 
other transcription regulatory region such as an enhancer. More preferably, the target DNA 
is in substantially pure form. Even more preferably, the target DNA is of known sequence. 
In a most preferred embodiment, the target DNA is purified DNA of known sequence of a 
promoter from a gene of interest, for example from a gene suspected of being associated 
with a disease state, more preferably from a gene useful in gene therapy. 

Examples of target sequences of interest include sequence motifs that are bound by 
transcription factors, such as zinc fingers. Particular examples include the promoters of 
genes involved in the biosynthesis and catabolism of gibberellins (Phillips et al, Plant 
Physiol 108: 1049-1057 (1995), MacMillan et al, Plant Physiol 113: 1369-1377 (1997), 
Williams et al, Plant Physiol 117: 559-563 (1998); Thomas et al, PNAS 96: 4698-4703 
(1999)); the promoters of genes whose products are responsible for ripening (such as 
polygalacturonase and ACC oxidase); the promoters of genes involved in the biosynthesis of 
volatile esters, which are important flavour compounds in fruits and vegetables (Dudareva et 
al, Plant Cell 8: 1137-1148 (1996); Dudareva et al, Plant J. 14: 297-304 (1998); Ross et 
al, Arch. Biochem. Biophys. 367: 9-16 (1999)); the promoters of genes involved in the 
biosynthesis of pharmaceutically important compounds; and the promoters of genes 
encoding allergens such as the peanut allergens Arah1, Arah2 and Arah3 (Rabjohn et al, 
J. Clin. Invest. 103: 535-542). 

Other plant promoters of interest are the bronze promoter (Ralston et al. Genetics 119: 
185-197 (1988) and Genbank Accession No. X07937.1) that directs expression of 
UDPglucose flavonoid glycosyl-transferase in maize, the patatin-1 gene promoter 
(Jefferson et al, Plant Mol. Biol. 14: 995-1006 (1990)) that contains sequences capable of 
directing tuber-specific expression, and the phenylalanine ammonia lyase promoter (Bevan 
et al. EMBO J. 8: 1899-1906 (1989)) thought to be involved in responses to mechanical 
wounding and normal development of the xylem and flower. 

Target DNA may also be provided as a plurality of sequences, for example where one or 
more residues in the nucleic acid sequence are varied or random. Examples of a plurality 
of sequences are libraries of nucleic acid sequences comprising putative zinc finger binding 
sites. Other sequence motifs that bind the DNA binding domain of a transcription factor 
may also be included in the plurality of sequences, typically varied or randomised at one or 
more positions. For example the chemically inducible promoter fragments described above 
may be randomised to produce a plurality of target DNA sequences for use in the screening 
methods of the present invention. 



E. Assays 

The methods of the present invention typically involve using a tripartite configuration of 
one or more DNA binding molecules, one or more DNA binding ligands and one or more 
target DNA sequences as described above to screen for (i) DNA binding molecules that 
bind to a target DNA in a manner that is modulatable by a DNA binding ligand (ii) DNA 
binding ligands that modulate binding of a DNA binding molecule to a target DNA and/or 

20 (iii) a target DNA that is bound modulatably by a DNA binding molecule as a result of an 
interaction with a DNA binding ligand. In other words the methods of the invention may 
be used to screen for any or all of the components of the gene switch system of the present 
invention. 

Typically, one or two of the components is a known constant while two or one,
respectively, of the other components are screened. For example, a given DNA binding
molecule and target DNA may be used to screen a plurality of DNA binding ligands or
candidate DNA binding ligands. Alternatively, a plurality of DNA binding molecules and
of DNA binding ligands may be screened against a given target DNA. Other combinations
are also envisaged.

Each component may be one individual molecular species or a plurality of molecular
species. Where a plurality of species is used, they may be substantially all known, partially
randomised or fully randomised. For example, the plurality of DNA binding molecules
may be a randomised zinc finger library and the plurality of target DNA may be a library of
nucleic acid molecules randomised at one or more, typically three or more contiguous,
residues.

However, all three components may be screened for simultaneously. Thus, in a preferred
embodiment, the invention provides a method for isolating multiple DNA binding
molecules in the presence of multiple DNA binding ligands, said DNA binding molecules
being selected using multiple target nucleic acid sequences in a single selection (isolation)
procedure.

The library of candidate DNA binding molecules is preferably a phage display library,
individual candidate molecules of the library optionally being structurally related to zinc
finger transcription factors (for example see Choo and Klug, (1994) PNAS (USA)
91:11163-67, which describes aspects of such libraries and is incorporated herein by
reference). This library is preferably constructed with DNA sequences of the form
GCGNNNGCG (where all 64 middle triplets are represented in the mixture).
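
As a minimal illustration of the composition of such a mixture, the 64 sites of the form
GCGNNNGCG can be enumerated as sketched below. The function name `gcg_nnn_gcg_sites` is a
hypothetical helper introduced only for this illustration and is not part of the disclosed
method.

```python
from itertools import product

BASES = "ACGT"

def gcg_nnn_gcg_sites(flank="GCG"):
    """Return every site of the form flank + NNN + flank, where N is any base.

    Hypothetical helper for illustration only.
    """
    return [flank + "".join(triplet) + flank for triplet in product(BASES, repeat=3)]

sites = gcg_nnn_gcg_sites()
print(len(sites))           # number of distinct middle triplets, 4**3
print(sites[0], sites[-1])  # first and last site in alphabetical order
```

Three randomised positions with four possible bases each give 4 x 4 x 4 = 64 distinct
middle triplets, which is the figure stated above.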

One or more DNA binding ligands means at least one DNA binding ligand, preferably two,
three or four DNA binding ligands, more preferably five, six, or seven DNA binding
ligands, most preferably a mixture of eight DNA binding ligands, or even more. The
ligands may be in any molar ratio to one another within the mixture, but will preferably be
approximately equimolar with one another.

Said method would preferably be carried out over at least 3, 4, 5 or 6 rounds of selection,
preferably about 6 rounds of selection.

DNA binding molecules (such as phage clones) isolated by the above methods would
preferably be individually assayed (for example in microtitre plates as described below) for
binding to the target DNA (such as a GCGNNNGCG mixture) in the presence and absence
of a mixture of the DNA binding ligands to identify clones which are capable of
ligand-modulatable binding.

Those phage clones which are capable of ligand-modulatable binding would preferably be
tested in the presence of a mixture of the eight ligands, in order to deduce the optimum
target DNA sequence, for example using different or variant target DNA sequences, or by
the binding site signature method (see Choo and Klug, (1994) PNAS (USA)
91:11163-67).

Where candidate DNA binding molecules are used rather than molecules known or
determined to have DNA binding properties, the method of the invention would preferably
feature a pre-selection step to remove candidate DNA binding molecules which do not
require ligand to bind the DNA.

Association of the candidate DNA binding molecule with the target DNA may be assessed
by any suitable means known to those skilled in the art. For example, the DNA may be
immobilised by biotinylation and linking to beads such as streptavidin coated beads
(Dynal). In a preferred embodiment wherein the DNA binding molecules are phage
displayed polypeptides, binding of said molecules to the DNA may be assessed by eluting
those phage which bind, and infecting logarithmic phase E. coli TG1 cells. The presence of
infective particles eluted from the DNA indicates that association of the DNA binding
molecule(s) with the DNA has occurred. Alternatively, association of the candidate DNA
binding molecule(s) with the target DNA may be assessed by Scintillation Proximity Assay
(SPA). For example, the target DNA could be biotinylated and immobilised to streptavidin
coated SPA beads, and the candidate DNA binding molecules may be radioactively
labelled, for example with 35S-methionine where the molecules are polypeptides.
Association of the candidate DNA binding molecules with the target DNA could then be
assessed by monitoring the readout of the SPA. Alternatively, the association could be
monitored by fluorescence resonance energy transfer (FRET). In this case, the target DNA
could be labelled with a donor fluor, and the DNA binding molecule(s) could be labelled
with a suitable acceptor fluor. Whilst the two entities are separated, no FRET would be
observed, but if association (binding) took place, then there would be a change in the
amount of FRET observed, thus allowing assessment of the degree of association.

Association of the candidate DNA binding molecule with the target DNA may also be
assessed by bandshift assays. Bandshift assays are conducted by measuring the mobility of
one or more of the components of the assay, for example the mobility of the DNA, as it is
electrophoresed through a suitable gel such as a polyacrylamide gel, as is well
known to those skilled in the art. In order to assess the association of the candidate DNA
binding molecule with the target DNA, the mobility of the DNA could be measured in the
presence and absence of the candidate DNA binding molecule. If the mobility of the target
DNA is essentially the same in the presence or absence of the candidate DNA binding
molecule, then it may be inferred that the molecules do not associate, or that the association
is weak. If the mobility of the DNA is retarded in the presence of the candidate DNA
binding molecule, then it may be inferred that the candidate molecule is associating with or
binding to the DNA.

Association of the candidate DNA binding molecule with the target DNA may also be
assessed using filter binding assays. For example, the target DNA molecule may be
immobilised on a suitable filter, such as a nitrocellulose filter. The candidate DNA binding
molecule may then be labelled, for example radioactively labelled, and contacted with the
immobilised target DNA. The binding of or association with the target DNA may be
assessed by comparing the amount of labelled candidate DNA binding molecule which
associates with the filter only to the amount of labelled candidate DNA binding molecule
which associates with the filter-immobilised target DNA. If more labelled candidate DNA
binding molecule associates with the immobilised DNA than with the filter only, it may be
inferred that the target DNA molecule does indeed associate with the candidate binding
molecule.
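
The comparison described for the filter binding assay can be sketched as a simple
background subtraction. This is an illustrative sketch only: the function names, the
`min_excess` threshold and the example counts are assumptions, not values taken from the
specification.

```python
def specific_binding(counts_with_dna, counts_filter_only):
    """Signal attributable to the immobilised DNA, after subtracting filter-only background."""
    return counts_with_dna - counts_filter_only

def associates(counts_with_dna, counts_filter_only, min_excess=0.0):
    """True if more label is retained on the DNA-bearing filter than on the filter alone.

    min_excess is a hypothetical margin that a user might set above assay noise.
    """
    return specific_binding(counts_with_dna, counts_filter_only) > min_excess

# Hypothetical scintillation counts for one labelled candidate molecule
print(specific_binding(1200.0, 150.0))
print(associates(1200.0, 150.0))
print(associates(150.0, 150.0))
```

In practice the margin would be chosen with reference to replicate filter-only
measurements, so that small differences within assay noise are not scored as association.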

Binding affinities may be estimated by any suitable means known to those skilled in the art. 
Binding affinities for the purposes of this invention may be absolute or may be relative.
Binding affinities may be determined biochemically, or may simply be estimated by 
assessing the association of the candidate DNA binding molecule with the target DNA as 
described above. As used herein, the term binding affinity may refer to a simple estimation 
of the association of one component of the system with another. 


Another suitable detection method is the use of target DNA sequences linked to reporter
constructs, such as bacterial luciferase or lacZ. Preferably, the reporter gene product can be
measured using optical detection techniques. By way of example, a multiarray format
could be used with a different candidate ligand in each position in the array (such as a
microtitre plate well) and the same library of zinc finger proteins and target DNA
sequences at each position. The zinc finger proteins will generally be fused to a
transcriptional activation domain such as the GAL4 acidic activation domain.
Transcription may then be compared in the various wells and wells showing a variation in
transcription compared to a control well with no ligand may be selected and the ligand
further tested to identify specific target sequences/zinc finger proteins whose interaction is
affected. These further tests may again be performed using an array format in which this
time the DNA binding ligand is kept constant and the target sequence/zinc fingers varied.
Phage display techniques as described above may be used to simplify the isolation of
suitable zinc finger proteins. Although described in the context of zinc fingers, this method
could be applied to other DNA binding molecules.
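
The well-by-well comparison against a no-ligand control can be sketched as follows. The
function name, the fold-change threshold of 2.0 and the example readings are assumptions
made for illustration; the specification does not prescribe a particular cut-off.

```python
def wells_with_altered_transcription(readings, control_well, min_fold_change=2.0):
    """Return {well: ratio to control} for wells whose reporter signal differs from control.

    readings: mapping of well name to reporter signal (for example luminescence).
    control_well: name of the well that received no ligand.
    min_fold_change: hypothetical threshold; a well is selected if its signal is at
    least this many times higher, or this many times lower, than the control.
    """
    control = readings[control_well]
    if control <= 0:
        raise ValueError("control signal must be positive")
    hits = {}
    for well, signal in readings.items():
        if well == control_well:
            continue
        ratio = signal / control
        if ratio >= min_fold_change or ratio <= 1.0 / min_fold_change:
            hits[well] = ratio
    return hits

# Hypothetical readings: A1 is the no-ligand control
readings = {"A1": 100.0, "A2": 105.0, "A3": 400.0, "A4": 20.0}
hits = wells_with_altered_transcription(readings, control_well="A1")
print(hits)
```

Both increases and decreases are reported, since a ligand may either promote or prevent
binding of the zinc finger protein to its target sequence.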

It is envisaged that the methods of the invention may be applied in vivo, for example they
could be applied to the selection or isolation of DNA binding molecules capable of
associating with target DNA in vivo inside one or more cells, in a manner analogous to the
one-hybrid system.

It is envisaged that the methods of the invention may be practised in parallel. For example, 
multiple target DNAs could be used in a single selective step, thereby enabling multiple 
DNA binding molecules to be isolated simultaneously, even in the same physical vessel. 
Said multiple DNA binding molecules may preferably be different from one another. Said 
multiple DNA binding molecules may have similar or identical DNA binding specificities, 
or may preferably have different DNA binding specificities. 






The invention may be worked using multiple DNA binding ligands, either separately or in 
combination. For example, a target nucleic acid sequence may be used to isolate DNA 
binding molecules according to the methods essentially as disclosed above, with the 
modification that more than one DNA binding ligand may be present. In this way, it is 
possible to isolate multiple DNA binding molecules which require different ligands to bind 
to the same target nucleic acid sequence(s).

By way of example, a particular embodiment of the method of the invention is as follows:

1. Bacterial colonies containing phage libraries that express a library of zinc fingers
randomised at one or more DNA binding residues (see section A.) are transferred from
plates to culture medium. Bacterial cultures are grown overnight at 30°C. Culture
supernatant containing phages is obtained by centrifugation.

2. 10 pmol of biotinylated target DNA immobilised on 50 mg streptavidin beads
(Dynal) is incubated with 1 ml of the bacterial culture supernatant diluted 1:1 with PBS
containing 50 µM ZnCl2, 4% Marvel, 2% Tween for 1 hour at 20°C on a rolling platform
as a preselection step to remove phage that bind to the target DNA in the absence of a
ligand.

3. After this time, 0.5 ml of phage solution is transferred to a streptavidin coated tube
and incubated with biotinylated DNA target site in the presence of a candidate DNA
binding ligand and 4 µg poly [d(I-C)]. After a one hour incubation the tubes are washed 20
times with PBS containing 50 µM ZnCl2 and 1% Tween, and 3 times with PBS containing
50 µM ZnCl2 to remove non-binding phage.

4. The remaining phage are eluted using 0.1 ml 0.1 M triethylamine and the solution is 
neutralised with an equal volume of 1 M Tris-Cl (pH 7.4). 

5. Logarithmic-phase E. coli TG1 cells are infected with eluted phage, and grown
overnight, as described above, to prepare phage supernatants for subsequent rounds of
selection.

6. After 4 rounds of selection (steps 1 to 5), bacteria are plated and phage prepared
from 96 colonies are screened for binding to the DNA target site in the presence and
absence of the ligand. Binding reactions are carried out in wells of a streptavidin-coated
microtitre plate (Boehringer Mannheim) and contain 50 µl of phage solution (bacterial
culture supernatant diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween),
0.15 pmol DNA target site and 0.25 µg poly [d(I-C)]. When added, the DNA binding
ligand is present at a concentration of about 1 µM.

7. After a one hour incubation the wells are washed 20 times with PBS containing
50 µM ZnCl2 and 1% Tween (and also ligand at a concentration of 1 µM where
appropriate), and 3 times with PBS containing 50 µM ZnCl2.

8. Bound phage are detected by ELISA (carried out in the presence of the ligand at a
concentration of about 1 µM where appropriate) with horseradish peroxidase-conjugated
anti-M13 IgG (Pharmacia Biotech) and quantitated using SOFTMAX 2.32 (Molecular
Devices).

9. Single colonies of transformants obtained after four rounds of selection as
described are grown overnight in culture. Single-stranded DNA is prepared from phage in
the culture supernatant and sequenced using the Sequenase™ 2.0 kit (U.S. Biochemical
Corp.). The amino acid sequences of the zinc finger clones are deduced.
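
The screen of step 6 to step 8 yields, for each clone, one ELISA signal in the presence of
ligand and one in its absence. A minimal sketch of how such paired readings might be
sorted into classes is given below. The function name, the class labels and the
`background` and `min_ratio` thresholds are assumptions for illustration and are not
values from the protocol.

```python
def classify_clone(signal_with_ligand, signal_without_ligand,
                   background=0.1, min_ratio=3.0):
    """Classify a clone from paired ELISA signals (for example absorbance units).

    background: hypothetical signal level below which a well is scored as no binding.
    min_ratio: hypothetical fold difference required to call binding ligand-modulated.
    """
    bound_with = signal_with_ligand > background
    bound_without = signal_without_ligand > background
    if not bound_with and not bound_without:
        return "non-binder"
    # Floor both signals at background so the ratio is always defined
    with_ligand = max(signal_with_ligand, background)
    without_ligand = max(signal_without_ligand, background)
    if with_ligand / without_ligand >= min_ratio:
        return "ligand-dependent"
    if without_ligand / with_ligand >= min_ratio:
        return "ligand-inhibited"
    return "ligand-independent"

# Hypothetical readings for four clones: (with ligand, without ligand)
for name, pair in {"c1": (1.2, 0.05), "c2": (0.05, 0.9),
                   "c3": (1.0, 0.9), "c4": (0.05, 0.02)}.items():
    print(name, classify_clone(*pair))
```

Clones in the "ligand-dependent" and "ligand-inhibited" classes correspond to the
ligand-modulatable binders sought by the screen; the others would be set aside.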

In the above example, only one target DNA sequence was used. Where a library of DNA 
sequences is used, the library of sequences can be screened using the ligand and selected 
phage expressing the zinc finger of interest to identify specific target DNA sequences. This 
may conveniently be carried out with the DNA sequences arrayed onto a solid substrate. 

In the above example, the zinc fingers (DNA binding molecules) are present on phage.
However, alternative methods for displaying the DNA binding molecules could be used. As
described in section A above, an entirely in vitro polysome display system has also been
reported (Mattheakis et al, (1994) Proc Natl Acad Sci USA, 91, 9022-6) in which nascent
peptides are physically attached via the ribosome to the RNA which encodes them. Using
a library of RNA/ribosomes expressing the DNA binding molecules, screening is
performed in a similar manner to the phage display method except that typically, after an
initial preselection step to remove DNA binding molecules that bind in the absence of the
ligand, only one selection step is performed and the resulting DNA binding molecules are
identified by cloning the RNA from the RNA/ribosome complexes and sequencing the
clones obtained.

To assist in isolating and/or identifying complexes comprising a target DNA, a DNA 
binding molecule and a DNA binding ligand, it may be desirable to label one or more of
the components with a detectable label. For example, the DNA may be labelled with a
fluorescent tag and the DNA binding molecule labelled with biotin, such that an enzyme
conjugate such as horseradish peroxidase (HRP), that catalyses an optically detectable
change in a substrate (different from the fluorescent tag) can be used. If the DNA binding 
ligand is attached to a bead, then tripartite complexes can be detected because they will 
both fluoresce and give HRP activity.

A further method which is useful where multiple candidate DNA binding ligands are to be
screened involves the use of beads to which are attached different peptide tags [...]. The
peptide tag can be used to [...]. Complexes containing [...] can be identified by the use of
labelled [...] molecules as described above. Beads comprising a [...] will [...] tag
determined by spectroscopy techniques will [...] the identity of the ligand.

In general, a bead format is advantageous [...] tripartite complexes and prescreening.

In a further aspect of the invention, DNA binding molecules according to the invention
may be advantageously [...] composition of a sample of target DNA. For example, a DNA
binding [...] which binds to a known target DNA sequence. By [...] it with one or more
test DNA samples and monitoring its binding thereto, [...] to define whether said DNA
sample(s) contain the cognate DNA recognition site of the DNA binding molecule [...]
composition [...] analyses may be advantageously [...] using the [...] phage [...]
sequence(s) in the presence or absence [...] ligand modulates binding.

Clearly, it may be that more than one ligand modulates binding of DNA binding molecules
to their cognate DNA [...]. [...] (such as phage clones) may be assayed for binding [...]
DNA sequence(s) in the presence of discrete ligand mixtures, wherein each ligand mixture
[...] of ligands. [...] may advantageously be determined. For example, [...] other lacking
ligand Y, are incapable of [...] of modulating the binding. This could advantageously be
further investigated according to the methods of the invention as described herein.

It is envisaged that this invention may be advantageously used in the isolation of a DNA
binding ligand that is capable of modulating the association of a particular DNA binding
molecule with its target DNA sequence. Accordingly, the invention provides a method for
isolating one or more DNA binding ligands, said ligands each binding one or more target
DNA sequence(s), wherein said binding to one or more target DNA sequence(s) modulates
the binding of one or more DNA binding molecules, and wherein said DNA binding
molecule(s) and said DNA binding ligands are different, said method comprising:

a) providing one or more target DNA molecule(s);

b) contacting the target DNA molecule(s) with one or more DNA binding molecule(s);

c) providing a library of candidate DNA binding ligands;

d) assessing the ability of candidate DNA binding ligands to modulate the association of
the DNA binding molecule(s) with the target DNA molecule(s); and

e) isolating those candidate DNA binding ligands which modulate the association of the
DNA binding molecule(s) with the target DNA molecule(s).

In order to remove DNA binding molecules (for example phage displayed polypeptides)
which bind DNA in a ligand-independent manner from a library, a pre-selection step may
optionally be performed in the absence of ligand prior to each round of selection. This step
removes from the library those clones which do not require ligand for DNA binding.
Optionally, candidate molecules selected in this manner may be screened by ELISA for
binding to the DNA target in the presence or absence of the ligand(s).

In the above described methods, in order to remove DNA binding molecules (for example
phage displayed polypeptides) which bind DNA in a ligand-dependent manner from a
library, a pre-selection step may optionally be performed in the presence of ligand prior to
each round of selection. This step removes from the library those clones which require
ligand for DNA binding. Optionally, candidate molecules selected in this manner may be
screened by ELISA for binding to the DNA target in the presence or absence of the
ligand(s).
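
The logic of a pre-selection step followed by a selection step can be sketched as two
successive filters over a library. This is a toy model only: the function
`selection_round`, the predicate `binds` and the clone names are hypothetical, and the
assumption that selection is carried out under the opposite ligand condition to
pre-selection is an inference made for the purpose of illustration.

```python
def selection_round(library, binds, want_ligand_dependent=True):
    """One round of pre-selection followed by selection.

    binds(clone, ligand_present) is a hypothetical predicate standing in for the
    physical binding step. If want_ligand_dependent is True, pre-selection is done
    without ligand; otherwise pre-selection is done with ligand.
    """
    preselect_with_ligand = not want_ligand_dependent
    # Pre-selection: discard clones that bind under the unwanted condition
    remaining = [c for c in library
                 if not binds(c, ligand_present=preselect_with_ligand)]
    # Selection: keep clones that bind under the opposite condition
    return [c for c in remaining
            if binds(c, ligand_present=not preselect_with_ligand)]

# Toy binding profiles: clone -> (binds without ligand, binds with ligand)
profiles = {"zf1": (True, True), "zf2": (False, True),
            "zf3": (True, False), "zf4": (False, False)}

def binds(clone, ligand_present):
    without_ligand, with_ligand = profiles[clone]
    return with_ligand if ligand_present else without_ligand

print(selection_round(list(profiles), binds))
print(selection_round(list(profiles), binds, want_ligand_dependent=False))
```

In this toy model, pre-selection without ligand leaves only the clone that needs ligand
to bind, while pre-selection with ligand leaves only the clone whose binding is prevented
by ligand.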

It is envisaged that the methods of the current invention may be advantageously applied to
the selection of molecules capable of binding nucleic acids other than DNA, for example
RNA. Structural considerations of RNA binding molecules are discussed in Afshar et al
(Afshar et al, 1999: Curr. Op. Biotech. vol 10 pp 59-63). In particular, ligands suitable
for use in the methods of the invention as applied to RNA include those ligands described
above, or may be selected from aminoglycosides and their derivatives such as
paromomycin, neomycin (for examples see Park et al, 1996: J. Am. Chem. Soc. vol 118
pp 10150-10155); aminoglycoside mimetics (Tok and Rando 1998: J. Am. Chem. Soc. vol
120 pp 8279-8280); acridine derivatives (for examples see Hamy et al, 1998: Biochemistry
vol 37 pp 5086-5095); small peptides ('aptamers'); polycationic compounds (for examples
see Wang et al, 1998: Tetrahedron 54 pp 7955-7976) or any other nucleic acid binding
molecules known to those skilled in the art. In a preferred embodiment, derivatives or
libraries of said nucleic acid binding ligands may be prepared.

Accordingly, the present invention provides a method for isolating an RNA binding
molecule which binds to a target RNA molecule in a manner modulatable by an
RNA-binding ligand, wherein said RNA-binding ligand and said RNA-binding molecule are
different, said method comprising: providing a target RNA molecule;

(a) contacting the target RNA molecule with an RNA-binding ligand, to produce an
RNA-ligand complex;

(b) assessing the ability of candidate RNA-binding molecules to bind the target RNA
molecule and the RNA-ligand complex; and isolating those candidate RNA-binding
molecules which bind the target RNA molecule and RNA-ligand complex with different
binding affinities.



It is further envisaged that the methods of the invention may be advantageously used to
select nucleic acid sequences which allow binding of a particular DNA binding
ligand/DNA binding molecule combination. For example, one may wish to isolate
particular DNA sequences to which a given DNA binding molecule is able to bind, or to
isolate only those DNA sequences which depend on the presence of ligand for the DNA
binding molecule to associate with them.

Accordingly, there is provided a method for isolating target DNA sequences to which a
particular DNA binding molecule will bind, said method comprising:

a) providing a library of target nucleic acid molecule(s);

b) contacting said nucleic acid molecules with a DNA binding molecule in the presence
or absence of DNA binding ligand;

c) assessing the ability of the candidate target DNA molecule(s) to bind the DNA binding
molecule; and

d) isolating those target nucleic acid molecules which bind the DNA binding molecule.

A library of target nucleic acid molecule(s) according to the invention may preferably
comprise a plurality of different nucleic acid molecules; preferably said nucleic acid
molecules may be related to one another in terms of sequence homology.

A library of candidate nucleic acid binding molecule(s) according to the invention may
preferably comprise a plurality of different candidate nucleic acid binding polypeptides;
preferably said candidate nucleic acid binding polypeptides may be related to one another
in terms of amino acid sequence homology.

It is envisaged that this method could be advantageously used in order to isolate DNA
sequences which require ligand to associate with a known DNA binding molecule. For
example, there may be a DNA sequence which is bound by a known DNA binding
molecule in a ligand-independent manner, and it may be desirable to find DNA
sequence(s) which can also associate with the same wild-type DNA binding molecule but
which do so in a ligand-modulatable manner. Preferably, this may be accomplished
according to the above method of the present invention.

The assay methods of the invention may be used to identify DNA binding molecules, DNA
binding ligands and/or target DNA where the binding of the DNA binding molecule to the
target DNA is modulatable by the DNA binding ligand.

These components, such as DNA binding proteins according to the invention and identified 
by the assay methods of the invention, may be used individually or in combination in a 
wide variety of applications. 

Thus, DNA binding proteins according to the invention and identified by the assay methods 
of the invention may be employed in a wide variety of applications, including diagnostics 
and as research tools. Advantageously, they may be employed as diagnostic tools for 
identifying the presence of particular nucleic acid molecules in a complex mixture. DNA 
binding molecules according to the invention can preferably differentiate between different 
target DNA molecules, and their binding affinities for the DNA target sequences are 
preferably modulated by DNA binding ligand(s). DNA binding molecules according to the 
invention are useful in switching or modulating gene expression, especially in gene therapy 
applications and agricultural biotechnology applications as described below. 

Specifically, targeted DNA binding molecules, such as zinc fingers, according to the 
invention may moreover be employed in the regulation of gene transcription, for example 
by specific cleavage of nucleic acid sequences using a fusion polypeptide comprising a zinc 
finger targeting domain and a DNA cleavage domain, or by fusion of a transcriptional
effector domain to a zinc finger, to activate or repress transcription from a gene which
possesses the zinc finger binding sequence in its upstream sequences. Preferably,
activation or repression only occurs in the presence of the DNA binding ligand, since in a
preferred embodiment the zinc fingers will not bind their target nucleic acid sequences in
the absence of the ligand. Alternatively, activation only occurs in the absence of the DNA
binding ligand, since the zinc fingers may not bind their target nucleic acid sequences in
the presence of the ligand. Zinc fingers capable of differentiating between U and T may be
used to preferentially target RNA or DNA, as required. Where RNA-targeting
polypeptides are intended, these are included in the term "DNA binding molecule".

Thus DNA binding molecules according to the invention will typically require the presence
of a transcriptional effector domain, such as an activation domain or a repressor domain.
Examples of transcriptional activation domains include the VP16 and VP64 transactivation
domains of Herpes Simplex Virus. Alternative transactivation domains are various and
include the maize C1 transactivation domain sequence (Sainz et al., 1997, Mol. Cell. Biol.
17: 115-22) and P1 (Goff et al., 1992, Genes Dev. 6: 864-75; Estruch et al., 1994, Nucleic
Acids Res. 22: 3983-89) and a number of other domains that have been reported from
plants (see Estruch et al., 1994, ibid).

Instead of incorporating a transactivator of gene expression, a repressor of gene expression
can be fused to the DNA binding protein and used to down regulate the expression of a
gene contiguous with or incorporating the DNA binding protein target sequence. Such
repressors are known in the art and include, for example, the KRAB-A domain (Moosmann
et al, Biol. Chem. 378: 669-677 (1997)), the engrailed domain (Han et al, Embo J. 12:
2723-2733 (1993)) and the snag domain (Grimes et al., Mol. Cell. Biol. 16: 6263-6?79
(1996)). These can be used alone or in combination to down-regulate gene expression.

Another possible application is the use of zinc fingers fused to nucleic acid cleavage
moieties, such as the catalytic domain of a restriction enzyme, to produce a restriction
enzyme capable of cleaving only target DNA of a specific sequence (see Kim et al (1996)
Proc. Natl. Acad. Sci. USA 93:1156-1160). Using such approaches, different DNA
binding domains can be used to create restriction enzymes with any desired recognition
nucleotide sequence, but which cleave DNA conditionally dependent on the presence or
absence of a particular DNA binding ligand, for instance Distamycin A. It may also be
possible to use enzymes other than those that cleave nucleic acids for a variety of purposes.

In a preferred embodiment, the zinc finger polypeptides of the invention may be employed 
to detect the presence of a particular target nucleic acid sequence in a sample. 

Accordingly, the invention provides a method for determining the presence of a target
nucleic acid molecule, comprising the steps of:

a) preparing a DNA binding protein by the method set forth above which is specific for
the target nucleic acid molecule;

b) exposing a test system which may comprise the target nucleic acid molecule to the
DNA binding protein under conditions which promote binding, and removing any DNA
binding protein which remains unbound;

c) detecting the presence of the DNA binding protein in the test system.

Regulation of gene expression in vivo 

In a particularly preferred embodiment of the present invention, DNA binding molecules
capable of binding to a target DNA in a manner modulatable by a DNA binding ligand are 
used to regulate expression from a gene in vivo. 

The target gene may be endogenous to the genome of the cell or may be heterologous.
However, in either case it will comprise a target DNA sequence, such as a target DNA
sequence described above, to which a DNA binding molecule of the invention binds in a
manner modulatable by a DNA binding ligand. Where the DNA binding molecule is a
polypeptide, it may typically be expressed from a DNA construct present in the host cell
comprising the target sequence. The DNA construct is preferably stably integrated into the
genome of the host cell, but this is not essential.

Thus in the case of polypeptide DNA binding molecules, a host cell according to the
invention comprises a target DNA sequence and a construct capable of directing expression
of the DNA binding molecule in the cell.

Suitable constructs for expressing the DNA binding molecule are known in the art and are
described in section B above. The coding sequence may be expressed constitutively or be
regulated. Expression may be ubiquitous or tissue-specific. Suitable regulatory sequences
are known in the art and are also described in section B above. Thus the DNA construct
will comprise a nucleic acid sequence encoding a DNA binding molecule operably linked
to a regulatory sequence capable of directing expression of the DNA binding molecule in a
host cell.

It may also be desirable to use target DNA sequences that include operably linked
neighbouring sequences that bind transcriptional regulatory proteins, such as
transactivators. Preferably the transcriptional regulatory proteins are endogenous to the
cell. If not, they typically will need to be introduced into the host cell using suitable
nucleic acid constructs.

Techniques for introducing nucleic acid constructs into host cells are known in the art for 
both prokaryotic and eukaryotic cells, including yeast, fungi, plant and animal cells. Many 
of these techniques are mentioned below in the section on the production of transgenic 
organisms. 

Regulation of expression of the gene of interest which comprises a second coding sequence 
operably linked to the target DNA sequence is typically achieved by administering to the 
cell a DNA binding ligand according to the invention. Typically, the DNA binding ligand
is a molecule such as Distamycin A which may be administered exogenously to the cell and 
taken up by the cell whereupon it may contact the DNA binding molecule and modulate its 
binding to the target sequence. However polypeptide DNA binding ligands may also be 
introduced into the cell either directly or by introducing suitable nucleic acid vectors, 
including viruses. 

The target DNA sequence and the DNA construct encoding the DNA binding molecule are 
preferably stably integrated into the genome of the host cell. Where the host cell is a single 
celled organism or part of a multicellular organism, the resulting organism may be termed 
transgenic. The target DNA may, in a preferred embodiment, be a naturally occurring 
sequence for which a corresponding DNA binding molecule and DNA binding ligand have 
been identified using the screening methods of the invention. 

The term "multicellular organism" here denotes all multicellular plants, fungi and animals 
except humans, i.e. prokaryotes and unicellular eukaryotes are excluded specifically. The 
term also includes an individual organism in all stages of development, including 
embryonic and fetal stages. A "transgenic" multicellular organism is any multicellular
organism containing cells that bear genetic information received, directly or indirectly, by
deliberate genetic manipulation at the subcellular level, such as by microinjection or
infection with recombinant virus. Preferably, the organism is transgenic by virtue of
comprising at least a heterologous nucleotide sequence encoding a DNA binding molecule
or target DNA as herein defined.

"Transgenic" in the present context does not encompass classical crossbreeding or in vitro
fertilization, but rather denotes organisms in which one or more cells receive a recombinant
DNA molecule. Transgenic organisms obtained by subsequent classical crossbreeding or
in vitro fertilization of one or more transgenic organisms are included within the scope of
the term "transgenic".

The term "germline transgenic organism" refers to a transgenic organism in which the 
genetic information has been taken up and incorporated into a germline cell, therefore 
conferring the ability to transfer the information to offspring. If such offspring, in fact, 
possess some or all of that information, then they, too, are transgenic multicellular 
organisms within the scope of the present invention.

The information to be introduced into the organism is preferably foreign to the species of
animal to which the recipient belongs (i.e., "heterologous"), but the information may also
be foreign only to the particular individual recipient, or genetic information already
possessed by the recipient. In the last case, the introduced gene may be expressed
differently from the native gene.

"Operably linked" refers to polynucleotide sequences which are necessary to effect the 
expression of coding and non-coding sequences to which they are ligated. The nature of
such control sequences differs depending upon the host organism; in prokaryotes, such
control sequences generally include promoter, ribosomal binding site, and transcription
termination sequence; in eukaryotes, generally, such control sequences include promoters
and a transcription termination sequence. The term "control sequences" is intended to
include, at a minimum, components whose presence can influence expression, and can also
include additional components whose presence is advantageous, for example, leader
sequences and fusion partner sequences.

Since the nucleic acid constructs are typically to be integrated into the host's genome, it is
important to include sequences that will permit expression of polypeptides in a particular
genomic context. One possible approach would be to use homologous recombination to
replace all or part of the endogenous gene whose expression it is desired to regulate with 
equivalent sequences comprising a target DNA in its regulatory sequences. This should 
ensure that the gene is subject to the same transcriptional regulatory mechanisms as the 
endogenous gene, with the exception of the target DNA sequence. Alternatively, 
homologous recombination may be used in a similar manner but with the regulatory 
sequences also replaced so that the gene is subject to a different form of regulation. 

However, if the construct encoding either the DNA binding molecule or target DNA is 
placed randomly in the genome, it is possible that the chromatin in that region will be 
transcriptionally silent and in a condensed state. If this occurs, then the polypeptide will not
be expressed; these are termed position-dependent effects. To overcome this problem, it
may be desirable to include locus control regions (LCRs) that maintain the intervening
chromatin in a transcriptionally competent open conformation. LCRs (also known as
scaffold attachment regions (SARs) or matrix attachment regions (MARs)) are well known
in the art, an example being the chicken lysozyme A element (Stief et al., 1989, Nature
341: 343), which can be positioned around an expressible gene of interest to effect an
increase in overall expression of the gene and diminish position-dependent effects upon
incorporation into the organism's genome (Stief et al., 1989, supra). Another example is
the CD2 gene LCR described by Lang et al., 1991, Nucl. Acid. Res. 19: 5851-5856.

Thus, a polynucleotide construct for use in the present invention, to introduce a nucleotide 
sequence encoding a DNA binding molecule into the genome of a multicellular organism, 
typically comprises a nucleotide sequence encoding the DNA binding molecule operably 
linked to a regulatory sequence capable of directing expression of the coding sequence. In 
addition the polynucleotide construct may comprise flanking sequences homologous to the 
host cell organism genome to aid in integration. An alternative approach would be to use 
viral vectors that are capable of integrating into the host genome, such as retroviruses. 



Preferably, a nucleotide construct for use in the present invention further comprises
flanking LCRs.

Construction of Transgenic Organisms Expressing DNA Binding Molecules

A transgenic organism of the invention is preferably a multicellular eukaryotic organism, 
such as an animal, a plant or a fungus. Animals include animals of the phyla cnidaria,
ctenophora, platyhelminthes, nematoda, annelida, mollusca, chelicerata, uniramia,
crustacea and chordata. Uniramians include the subphylum hexapoda that includes insects
such as the winged insects. Chordates include vertebrate groups such as mammals, birds,
reptiles and amphibians. Particular examples of mammals include non-human primates,
cats, dogs, ungulates such as cows, goats, pigs, sheep and horses and rodents such as mice,
rats, gerbils and hamsters.

Plants include the seed-bearing plants angiosperms and conifers. Angiosperms include 
dicotyledons and monocotyledons. Examples of dicotyledonous plants include tobacco
(Nicotiana plumbaginifolia and Nicotiana tabacum), arabidopsis (Arabidopsis thaliana),
Aspergillus niger, Brassica napus, Brassica nigra, Datura innoxia, Vicia narbonensis, 
Vicia faba, pea (Pisum sativum), cauliflower, carnation and lentil (Lens culinaris). 
Examples of monocotyledonous plants include cereals such as wheat, barley, oats and 
maize. 

Production of transgenic animals 

Techniques for producing transgenic animals are well known in the art. A useful general 
textbook on this subject is Houdebine, Transgenic animals - Generation and Use (Harwood 
Academic, 1997) - an extensive review of the techniques used to generate transgenic
animals from fish to mice and cows. 

Advances in technologies for embryo micromanipulation now permit introduction of 
heterologous DNA into, for example, fertilized mammalian ova. For instance, totipotent or
pluripotent stem cells can be transformed by microinjection, calcium phosphate mediated
precipitation, liposome fusion, retroviral infection or other means, the transformed cells are
then introduced into the embryo, and the embryo then develops into a transgenic animal. In
a highly preferred method, developing embryos are infected with a retrovirus containing
the desired DNA, and transgenic animals produced from the infected embryo. In a most
preferred method, however, the appropriate DNAs are coinjected into the pronucleus or
cytoplasm of embryos, preferably at the single cell stage, and the embryos allowed to
develop into mature transgenic animals. Those techniques are well known. See reviews of
standard laboratory procedures for microinjection of heterologous DNAs into mammalian
fertilized ova, including Hogan et al., Manipulating the Mouse Embryo, (Cold Spring
Harbor Press 1986); Krimpenfort et al., Bio/Technology 9:844 (1991); Palmiter et al., Cell,
41: 343 (1985); Kraemer et al., Genetic Manipulation of the Mammalian Embryo, (Cold
Spring Harbor Laboratory Press 1985); Hammer et al., Nature, 315: 680 (1985); Wagner et
al., U.S. Pat. No. 5,175,385; Krimpenfort et al., U.S. Pat. No. 5,175,384, the respective
contents of which are incorporated herein by reference.

Another method used to produce a transgenic animal involves microinjecting a nucleic acid 
into pro-nuclear stage eggs by standard methods. Injected eggs are then cultured before 
transfer into the oviducts of pseudopregnant recipients. 

Transgenic animals may also be produced by nuclear transfer technology as described in 
Schnieke, A.E. et al., 1997, Science, 278: 2130 and Cibelli, J.B. et al., 1998, Science, 280:
1256. Using this method, fibroblasts from donor animals are stably transfected with a
plasmid incorporating the coding sequences for a binding domain or binding partner of
interest under the control of regulatory sequences. Stable transfectants are then fused to enucleated
oocytes, cultured and transferred into female recipients. 

Analysis of animals which may contain transgenic sequences would typically be performed 
by either PCR or Southern blot analysis following standard methods. 

By way of a specific example for the construction of transgenic mammals, such as cows, 
nucleotide constructs comprising a sequence encoding a DNA binding molecule are 
microinjected using, for example, the technique described in U.S. Pat. No. 4,873,191, into
oocytes which are obtained from ovaries freshly removed from the mammal. The oocytes
are aspirated from the follicles and allowed to settle before fertilization with thawed frozen
sperm capacitated with heparin and prefractionated by Percoll gradient to isolate the motile
fraction.

The fertilized oocytes are centrifuged, for example, for eight minutes at 15,000 g to
visualize the pronuclei for injection and then cultured from the zygote to morula or
blastocyst stage in oviduct tissue-conditioned medium. This medium is prepared by using
luminal tissues scraped from oviducts and diluted in culture medium. The zygotes must be
placed in the culture medium within two hours following microinjection.

Oestrus is then synchronized in the intended recipient mammals, such as cattle, by
administering coprostanol. Oestrus is produced within two days and the embryos are
transferred to the recipients 5-7 days after oestrus. Successful transfer can be evaluated in
the offspring by Southern blot.



Alternatively, the desired constructs can be introduced into embryonic stem cells (ES cells) 
and the cells cultured to ensure modification by the transgene. The modified cells are then 
injected into the blastula embryonic stage and the blastulas replaced into pseudopregnant
hosts. The resulting offspring are chimeric with respect to the ES and host cells, and
nonchimeric strains which exclusively comprise the ES progeny can be obtained using
conventional cross-breeding. This technique is described, for example, in WO 91/10741.



Production of transgenic plants



Techniques for producing transgenic plants are well known in the art. Typically, either 
whole plants, cells or protoplasts may be transformed with a suitable nucleic acid construct 
encoding a DNA binding molecule or target DNA (see above for examples of nucleic acid
constructs). There are many methods for introducing transforming DNA constructs into
cells, but not all are suitable for delivering DNA to plant cells. Suitable methods include
Agrobacterium infection (see, among others, Turpen et al., 1993, J. Virol. Methods, 42:
227-239) or direct delivery of DNA such as, for example, by PEG-mediated
transformation, by electroporation or by acceleration of DNA coated particles. Acceleration
methods are generally preferred and include, for example, microprojectile bombardment. A
typical protocol for producing transgenic plants (in particular monocotyledons), taken from
U.S. Patent No. 5,874,265, is described below.

An example of a method for delivering transforming DNA segments to plant cells is
microprojectile bombardment. In this method, non-biological particles may be coated with 
nucleic acids and delivered into cells by a propelling force. Exemplary particles include 
those comprised of tungsten, gold, platinum, and the like. 

A particular advantage of microprojectile bombardment, in addition to it being an effective 
means of reproducibly stably transforming both dicotyledons and monocotyledons, is that 
neither the isolation of protoplasts nor the susceptibility to Agrobacterium infection is 
required. An illustrative embodiment of a method for delivering DNA into plant cells by 
acceleration is a Biolistics Particle Delivery System, which can be used to propel particles 
coated with DNA through a screen, such as a stainless steel or Nytex screen, onto a filter 
surface covered with plant cells cultured in suspension. The screen disperses the tungsten- 
DNA particles so that they are not delivered to the recipient cells in large aggregates. It is 
believed that without a screen intervening between the projectile apparatus and the cells to 
be bombarded, the projectiles aggregate and may be too large for attaining a high frequency 
of transformation. This may be due to damage inflicted on the recipient cells by projectiles 
that are too large. 

For the bombardment, cells in suspension are preferably concentrated on filters. Filters 
containing the cells to be bombarded are positioned at an appropriate distance below the 
macroprojectile stopping plate. If desired, one or more screens are also positioned between 
the gun and the cells to be bombarded. Through the use of techniques set forth herein one 
may obtain up to 1000 or more clusters of cells transiently expressing a marker gene 
("foci") on the bombarded filter. The number of cells in a focus which express the 
exogenous gene product 48 hours post-bombardment often ranges from 1 to 10 and averages
2 to 3.



After effecting delivery of exogenous DNA to recipient cells by any of the methods 
discussed above, a preferred step is to identify the transformed cells for further culturing 
and plant regeneration. This step may include assaying cultures directly for a screenable 
trait or by exposing the bombarded cultures to a selective agent or agents.

An example of a screenable marker trait is the red pigment produced under the control of
the R-locus in maize. This pigment may be detected by culturing cells on a solid support
containing nutrient media capable of supporting growth at this stage, incubating the cells
at, e.g., 18°C and greater than 180 µE m⁻² s⁻¹, and selecting cells from colonies (visible
aggregates of cells) that are pigmented. These cells may be cultured further, either in
suspension or on solid media.

An exemplary embodiment of methods for identifying transformed cells involves exposing 
the bombarded cultures to a selective agent, such as a metabolic inhibitor, an antibiotic, 
herbicide or the like. Cells which have been transformed and have stably integrated a
marker gene conferring resistance to the selective agent used, will grow and divide in 
culture. Sensitive cells will not be amenable to further culturing. 

To use the bar-bialaphos selective system, bombarded cells on filters are resuspended in 
nonselective liquid medium, cultured (e.g. for one to two weeks) and transferred to filters
overlaying solid medium containing from 1-3 mg/l bialaphos. While ranges of 1-3 mg/l
will typically be preferred, it is proposed that ranges of 0.1-50 mg/l will find utility in the
practice of the invention. The type of filter for use in bombardment is not believed to be 
particularly crucial, and can comprise any solid, porous, inert support. 

Cells that survive the exposure to the selective agent may be cultured in media that 
supports regeneration of plants. Tissue is maintained on a basic media with hormones for 
about 2-4 weeks, then transferred to media with no hormones. After 2-4 weeks, shoot 
development will signal the time to transfer to another media. 

Regeneration typically requires a progression of media whose composition has been 
modified to provide the appropriate nutrients and hormonal signals during sequential 
developmental stages from the transformed callus to the more mature plant. Developing 
plantlets are transferred to soil, and hardened, e.g., in an environmentally controlled 
chamber at about 85% relative humidity, 600 ppm CO2, and 250 µE m⁻² s⁻¹ of light. Plants
are preferably matured either in a growth chamber or greenhouse. Regeneration will
typically take about 3-12 weeks. During regeneration, cells are grown on solid media in
tissue culture vessels. An illustrative embodiment of such a vessel is a petri dish.
Regenerating plants are preferably grown at about 19°C to 28°C. After the regenerating
plants have reached the stage of shoot and root development, they may be transferred to a 
greenhouse for further growth and testing. 



Genomic DNA may be isolated from callus cell lines and plants to determine the presence
of the exogenous gene through the use of techniques well known to those skilled in the art 
such as PCR and/or Southern blotting. 

Several techniques exist for inserting the genetic information, the two main principles 
being direct introduction of the genetic information and introduction of the genetic
information by use of a vector system. A review of the general techniques may be found in
articles by Potrykus (Annu Rev Plant Physiol Plant Mol Biol [1991] 42:205-225) and
Christou (Agro-Food-Industry Hi-Tech March/April 1994 17-27).

Thus, in one aspect, the present invention relates to a vector system which carries a
construct encoding a DNA binding molecule or target DNA according to the present 
invention and which is capable of introducing the construct into the genome of an 
organism, such as a plant. 

The vector system may comprise one vector, but it can comprise at least two vectors. In
the case of two vectors, the vector system is normally referred to as a binary vector system.
Binary vector systems are described in further detail in Gynheung An et al. (1980), Binary
Vectors, Plant Molecular Biology Manual A3, 1-19. 

One extensively employed system for transformation of plant cells with a given promoter
or nucleotide sequence or construct is based on the use of a Ti plasmid from
Agrobacterium tumefaciens or a Ri plasmid from Agrobacterium rhizogenes (An et al.
(1986), Plant Physiol. 81, 301-305 and Butcher D.N. et al. (1980), Tissue Culture Methods
for Plant Pathologists, eds.: D.S. Ingrams and J.P. Helgeson, 203-208).

Several different Ti and Ri plasmids have been constructed which are suitable for the
construction of the plant or plant cell constructs described above.

Examples of specific applications

The DNA binding molecule/ target DNA/ DNA binding ligand combination may be used to 
regulate the expression of a nucleotide sequence of interest such as in a cell of an 
organism, including prokaryotes, yeasts, fungi, plants and animals, for example mammals,
including humans. 

Nucleotide sequences of interest include genes associated with disease in humans and 
animals and therapeutic genes. Thus a DNA binding molecule may be used in conjunction 
with a target DNA sequence and DNA binding ligand in a method of treating or preventing
disease in an animal or human patient.

Alternatively, a genetic switch of the invention comprising a DNA binding molecule, a
target DNA sequence and a DNA binding ligand, wherein the DNA binding ligand
modulates binding of the DNA binding molecule to the target DNA, may be used to regulate
expression of a nucleotide sequence of interest in a plant. Examples of specific 
applications include the following: 

1. Improvement of ripening characteristics in fruit. A number of genes have been
identified that are involved in the ripening process (such as in ethylene biosynthesis).
Control of the ripening process via regulation of the expression of those genes will help
reduce significant losses via spoilage.

2. Modification of plant growth characteristics through intervention in hormonal
pathways. Many plant characteristics are controlled by hormones. Regulation of the genes
involved in the production of and response to hormones will enable the production of crops
with altered characteristics.

3. Improvement of other characteristics by manipulation of plant gene expression.
Overexpression of the Na+/H+ antiport gene has resulted in enhanced salt tolerance in
Arabidopsis. Targeted zinc fingers could be used to regulate the endogenous gene.

4. Improvement of plant aroma and flavour. Pathways leading to the production of
aroma and flavour compounds in vegetables and fruit are currently being elucidated,
allowing the enhancement of these traits using gene switch technology.

5. Improving the pharmaceutical and nutraceutical potential of plants. Many 
pharmaceutically active compounds are known to exist in plants, but in many cases
production is limited due to insufficient biosynthesis in plants. Gene switch technology
could be used to overcome this limitation by upregulating specific genes or biochemical 
pathways. Other uses include regulating the expression of genes involved in biosynthesis 
of commercially valuable compounds that are toxic to the development of the plant. 

6. Reducing harmful plant components. Some plant components lead to adverse 
allergic reaction when ingested in food. Gene switch technology could be used to overcome 
this problem by downregulating specific genes responsible for these reactions. 

7. As well as modulating the expression of endogenous genes, heterologous genes 
may be introduced whose expression is regulated by a gene switch of the invention. For 
example, a nucleotide sequence of interest may encode a gene product that is preferentially 
toxic to cells of the male or female organs of the plant such that the ability of the plant to 
reproduce can be regulated. Alternatively, or in addition, the regulatory sequences to 
which the nucleotide sequence is operably linked may be tissue-specific such that 
expression when induced only occurs in male or female organs of the plant. Suitable 
sequences and/or gene products are described in WO89/10396, WO92/04454 (the TA29 
promoter from tobacco) and EP-A-344,029, EP-A-412,006 and EP-A-412,911.

The present invention will now be described by way of the following examples, which are 
illustrative only and non-limiting. The examples refer to the figures:

Brief Description of the Figures

Figure 1 shows a graph of the effect of Distamycin A concentration on binding of two
different phage (clone 3 (3/2F) and clone 4 (4/5F)) to the DNA sequence AAAAAGGCG.
In this case, the small molecule causes phage binding to DNA.

Figure 2 shows a graph of the effect of Actinomycin D concentration on binding of two 
different phage (AD clone 1 and 6) to the DNA sequence AGCTTGGCG. In this case, the 
small molecule causes phage binding to DNA.

Figure 3 shows four different phage (0.4/1, 0.4/2, 0.4/4 and 0.4/5) binding to the 
randomised DNA oligo YRYRYGGCG (where Y is C or T and R is G or A) in the 
presence, but not in the absence, of echinomycin (EM). 
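
By way of illustration only, the randomised oligo YRYRYGGCG can be expanded into the individual DNA sequences it represents. The following Python sketch assumes only the base codes stated above (Y is C or T, R is G or A); the function name expand_degenerate is hypothetical and forms no part of the methods described herein.

```python
from itertools import product

# Base codes used in the randomised oligo; plain bases map to themselves.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "GA"}

def expand_degenerate(oligo):
    """Return every concrete sequence encoded by a degenerate oligo."""
    choices = [CODES[base] for base in oligo]
    return ["".join(combo) for combo in product(*choices)]

sequences = expand_degenerate("YRYRYGGCG")
print(len(sequences))  # five two-fold positions give 2**5 = 32 sequences
```

Each of the five randomised positions admits two bases, so the oligo represents 32 distinct nine-base sites sharing the fixed GGCG portion.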

Figure 4 shows the binding site signature of phage 0.4/4 selected using the randomised
DNA sequence (Y1)(R2)(Y3)(R4)(Y5)GGCG. The phage has a preference for the DNA 
sequence (T)(G/A)(C)(G/A)(T) in the presence of echinomycin. 

Figure 5 shows binding of the phage 0.4/4 to three related DNA sequences, 
TACGTGGCG, TGTATGGCG and CGTACGGCG, as a function of echinomycin
concentration. The first DNA site contains the optimal binding sequence as revealed by the 
binding site signature. 
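
As a minimal sketch, the three sites of Figure 5 may be checked against the preference reported for phage 0.4/4 in Figure 4, namely (T)(G/A)(C)(G/A)(T) followed by the fixed GGCG. The names SIGNATURE and matches_signature are hypothetical, and the check is a simple positional comparison rather than a model of binding affinity.

```python
# Allowed bases at each of the nine positions: the five randomised positions
# as reported for phage 0.4/4, followed by the fixed GGCG portion.
SIGNATURE = ["T", "GA", "C", "GA", "T", "G", "G", "C", "G"]

def matches_signature(site):
    """True if every base of the site is allowed at its position."""
    return len(site) == len(SIGNATURE) and all(
        base in allowed for base, allowed in zip(site, SIGNATURE)
    )

for site in ("TACGTGGCG", "TGTATGGCG", "CGTACGGCG"):
    print(site, matches_signature(site))
```

Under these assumptions only the first site, TACGTGGCG, satisfies the signature at every position, in agreement with its description as the optimal binding sequence.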

Figure 6 shows a graph of the effect of ligand concentration on binding of two different 
phage to specific DNA sequences. In this case, the respective phage are dissociated from
the DNA in the presence of distamycin A or actinomycin D. 

Examples 

Example 1 - Preparation and Screening of a Zinc Finger Phage Display Library

Selection Of Zinc Finger Phage Binding DNA Targets In The Presence Of Small
Molecules

Example 1.1 Selection of Zinc Finger Phage that Bind DNA In The Presence Of
Distamycin A

A powerful method of selecting DNA binding proteins is the cloning of peptides (Smith
(1985) Science 228, 1315-1317), or protein domains (McCafferty et al., (1990) Nature
348:552-554; Bass et al., (1990) Proteins 8:309-314), as fusions to the minor coat protein
(pIII) of bacteriophage fd, which leads to their expression on the tip of the capsid. A phage
display library is created comprising variants of the middle finger from the DNA binding
domain of Zif268.

Materials And Methods 

Construction And Cloning Of Genes. 

In general, procedures and materials are in accordance with guidance given in Sambrook et
al., Molecular Cloning. A Laboratory Manual, Cold Spring Harbor, 1989. The gene for
the Zif268 fingers (residues 333-420) is assembled from 8 overlapping synthetic
oligonucleotides (see Choo and Klug, (1994) PNAS (USA) 91:11163-67), giving SfiI and
NotI overhangs. The genes for fingers of the phage library are synthesised from 4
oligonucleotides by directional end to end ligation using 3 short complementary linkers,
and amplified by PCR from the single strand using forward and backward primers which
contain sites for NotI and SfiI respectively. Backward PCR primers in addition introduce
Met-Ala-Glu as the first three amino acids of the zinc finger peptides, and these are
followed by the residues of the wild type or library fingers as required. Cloning overhangs
are produced by digestion with SfiI and NotI where necessary. Fragments are ligated to 1
µg similarly prepared Fd-Tet-SN vector. This is a derivative of fd-tet-DOG1
(Hoogenboom et al., (1991) Nucleic Acids Res. 19, 4133-4137) in which a section of the
pelB leader and a restriction site for the enzyme SfiI (underlined) have been added by
site-directed mutagenesis using the oligonucleotide:



5' CTCCTGCAGTTGGACCTGTGCCAT GGCCGGCTGGGC CGCATAGAATGG 
AACAACTAAAGC 3' (Seq ID No. 1)

which anneals in the region of the polylinker. Electrocompetent DH5α cells are
transformed with recombinant vector in 200 ng aliquots, grown for 1 hour in 2xTY medium
with 1% glucose, and plated on TYE containing 15 µg/ml tetracycline and 1% glucose.

The zinc finger phage display library of the present invention contains amino acid
randomisations in putative base-contacting positions from the second and third zinc fingers
of the three-finger DNA binding domain of Zif268, and contains members that bind DNA
of the sequence XXXXXGGCG where X is any base. Further details of the library used
may be found in WO 98/53057, which is incorporated herein by reference. The DNA
sequences AAAAAAGGCG and AAAAAAGGCGAAAAAA are used as selection targets
in this example because short runs of adenines can cause intrinsic DNA bending;
moreover, the structure of the bend can be disrupted by binding of the antibiotic
distamycin A.

Phage Selection.

Bacterial colonies containing zinc finger phage libraries are transferred from plates to 
200 ml 2xTY medium (16 g/litre Bactotryptone, 10 g/litre Bactoyeast extract, 5 g/litre NaCl)
containing 50 µM ZnCl2 and 15 µg/ml tetracycline. Bacterial cultures are grown overnight
at 30°C. Culture supernatant containing phages is obtained by centrifuging at 300xg for 5
minutes.

Phage selection is over 4 rounds. Before each round, a pre-selection step is included 
comprising binding of 10 pmol of biotinylated DNA target sites immobilised on 50mg 
streptavidin coated beads (Dynal) to 1 ml of phage solution (bacterial culture supernatant
diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween), for 1 hour at 20°C
on a rolling platform. After this time, 0.5 ml of phage solution is transferred to a
streptavidin coated tube and incubated with 2 pmol biotinylated DNA target site in the
presence of 2 µM distamycin A (Sigma) and 4 µg poly [d(I-C)]. After a one hour
incubation the tubes are washed 20 times with PBS containing 50 µM ZnCl2 and 1%
Tween, and 3 times with PBS containing 50 µM ZnCl2. Phage are eluted using 0.1 ml 0.1 M
triethylamine and the solution is neutralised with an equal volume of 1 M Tris-Cl (pH 7.4).
Logarithmic-phase E. coli TG1 cells are infected with eluted phage, and grown overnight,
as described above, to prepare phage supernatants for subsequent rounds of selection.
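
For orientation, the amounts of target DNA given above can be converted to nominal concentrations. The following Python sketch assumes that the stated volumes (1 ml for the pre-selection and 0.5 ml for the selection) are the final reaction volumes and neglects the volume of any added reagents; the helper name nanomolar is hypothetical.

```python
def nanomolar(amount_pmol, volume_ml):
    """Convert an amount in pmol and a volume in ml to a concentration in nM."""
    # 1 pmol per ml = 1e-12 mol / 1e-3 litre = 1e-9 mol per litre = 1 nM
    return amount_pmol / volume_ml

# Assumed final volumes; the bead-bound DNA of the pre-selection is treated
# as if it were free in solution, so this figure is nominal only.
preselection_nM = nanomolar(10, 1.0)  # 10 pmol target in 1 ml
selection_nM = nanomolar(2, 0.5)      # 2 pmol target in 0.5 ml
print(preselection_nM, selection_nM)
```

On these assumptions the selection step presents the target at a nominal 4 nM, against 2 µM distamycin A, so the ligand is in large molar excess over the DNA target.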



After 4 rounds of selection, bacteria are plated and phage prepared from 96 colonies are 
screened for binding to the DNA target site in the presence and absence of distamycin A. 
Binding reactions are carried out in wells of a streptavidin-coated microtitre plate 
(Boehringer Mannheim) and contain 50 µl of phage solution (bacterial culture supernatant
diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween), 0.15 pmol DNA
target site and 0.25 µg poly [d(I-C)]. When added, distamycin A is present at a
concentration of 2 µM. After a one hour incubation the wells are washed 20 times with
PBS containing 50 µM ZnCl2 and 1% Tween (and also distamycin A at a concentration of
2 µM where appropriate), and 3 times with PBS containing 50 µM ZnCl2. Bound phage are
detected by ELISA (carried out in the presence of distamycin A at a concentration of 2 µM
where appropriate) with horseradish peroxidase-conjugated anti-M13 IgG (Pharmacia 
Biotech) and quantitated using SOFTMAX 2.32 (Molecular Devices). 



Sequencing Of Selected Phage.

Single colonies of transformants obtained after four rounds of selection as described, are 
grown overnight in 2xTY/Zn/Tet. Small aliquots of the cultures are stored in 15% glycerol 
at -20°C, to be used as an archive. Single-stranded DNA is prepared from phage in the 
culture supernatant and sequenced using the Sequenase™ 2.0 kit (U.S. Biochemical
Corp.). The amino acid sequences of the zinc finger clones are deduced.

Amino acid sequences from helical regions of zinc fingers selected to bind DNA in the
presence of distamycin

            F1          F2          F3
           -1123456    -1123456    -1123456
Clone 1     RSDELTR     RSDDLST     TNNTRIK
Clone 2     RSDELTR     RSDDLST     HKATRIK
Clone 3     RSDELTR     RSDDLST     TDKVRKK
Clone 4     RSDELTR     RSDDLST     HNASRIN
Clone 5     RSDELTR     RSDDLSV     TNNSRKK
Clone 6     RSDELTR     RSDDLST     TNATRKK
Clone 7     RSDELTR     RSDDLSQ     TRNTRKN
Clone 8     RSDELTR     RSDDLSV     TNNSRKN

Clones 1-4 were selected to bind the oligo:

tataAAAAAAGGCGTGtcacagtcagtccacacgtc

Clones 5-8 were selected to bind the oligo:
tataAAAAAAGGCGAAAAAAtcacagtcagtccacacgtc

Zinc finger phage clones are isolated according to this method which bind the target with 
higher affinity in the presence of ligand than in the absence of ligand (see Figure 1). This 
method also selected certain clones that bound DNA in the absence of the ligand but were 
displaced from the DNA in the presence of the ligand (see Example 1.4 below).

Example 1.2 - Selection of Zinc Finger Phage Binding DNA In The Presence of 
Actinomycin D 

An adaptation of the method outlined in Example 1.1 was used to isolate phage that
bound DNA in the presence of a different small molecule, actinomycin D. In this example
the DNA target was AGCTTGGCG.



Phage Selection 

Essentially the method was the same as that used in the previous section, using four rounds
of a preselection step followed by a selection step, washing and elution. Differences in the
method are described here. The preselection step comprised 7.5 pmol of biotinylated DNA
target site immobilised on 18.75 µl streptavidin coated beads (Dynal) in a 100 µl mixture
containing 4 µl phage library, 96 µl PBS, 2% Marvel, 1% Tween-20 and 50 µM ZnCl2,
incubated for 1 hour at room temperature with constant mixing. Phage selections were made
in streptavidin coated tubes with the phage supernatant, 5 nM biotinylated target DNA and
10 µM actinomycin D in the presence of 1 µg poly [d(I-C)] competitor. The selections were
incubated for 1 hour at room temperature. The bound phage were washed and eluted as
described above.



ELISA was performed as described above but using 5 nM biotinylated target DNA, 0.25 µg
poly [d(I-C)] competitor in the assay and 10 µM actinomycin D where appropriate. Phage
were sequenced using Big Dye Terminator Cycle Sequencing Kit (Perkin Elmer
Biosystems) and automated sequencing.

clone 1   RSDELTRHIRIH   RSDTLSVHIRTH
clone 6   RSDELTRHIRIH   RSDHLSVHIRTH

These two clones were selected using the oligo:
tatacaAGCTTGGCGatcacagtcagtccacacgtc

These zinc finger clones bind to the target oligo with higher affinity in the presence of 
actinomycin D than in the absence of DNA binding ligand (see Figure 2). 

Example 1.3 - Selection of Zinc Finger Phage Using Randomised DNA In The Presence Of
Echinomycin, And Subsequent Deconvolution of Binding Partners

In this experiment the library of DNA binding molecules was sorted using a library of
DNA sequences in the presence of a small molecule. After DNA binding molecules that
bound to DNAs in the presence of the small molecule had been selected, the optimal
binding site(s) for each DNA binding molecule were determined using the binding site
signature.



HMAHRKTHTKIH 
KKFAHSAHRKTHTKIH 



a) Selections 

In this experiment, 50 pmol of DNA target library of sequence YRYRYGGCG (where Y is
C or T and R is G or A) was bound to 125 µl of streptavidin coated beads (Dynal) and the
beads were used to preselect 0.4 µl of phage library in 100 µl of PBS, 2% Marvel, 1%
Tween-20, 50 µM ZnCl2 for 1 hour at room temperature with constant mixing. Phage
selections were made in streptavidin coated tubes with the phage supernatant, 30 nM
biotinylated target DNA and 10 µM echinomycin in the presence of 1 µg poly [d(I-C)]
competitor. The selections were incubated for 1 hour at room temperature. The bound
phage were washed and eluted as described above.



ELISA was performed as described above but using 30 nM biotinylated target DNA, 0.5 µg
poly [d(I-C)] competitor in the assay and 10 µM echinomycin where appropriate. Phage
were sequenced using Big Dye Terminator Cycle Sequencing Kit (Perkin Elmer
Biosystems) and automated sequencing.



Four different clones were selected using the DNA library
tatagtYRYRYGGCGatcacagtcagtccacacgtc in the presence of echinomycin (see Figure 3).

clone 0.4/1   RSDELTRHIRIH   RSDHLSKHIRTH   KKFARSQTRINHTKIH
clone 0.4/2   RSDELTRHIRIH   RSDHLSEHIRTH   * TRNARTKHTKIH
clone 0.4/4   RSDELTRHIRIH   RSDHLSNHIRTH   RNDTRKTHTKIH
clone 0.4/5   RSDELTRHIRIH   RSDNLSTHIRTH   KKFAHSNTRKMHTKIH



b) Binding site signature 



The signature of clone 0.4/4 was determined using a modified binding site signature
assay. Each of the 5 randomised positions of the oligo was tested in turn: the base was fixed
at that position whilst the remaining 4 positions contained defined mixtures of bases. For a
pyrimidine position the base was fixed as either C or T and for a purine position the base
was fixed as either G or A, so that by testing each position in turn an optimal sequence or
binding site signature could be determined.

In each well of a streptavidin-coated microtitre plate 2 µl of phage solution (overnight E.
coli culture supernatant containing phage) were mixed with 48 µl of 2% Marvel, 1%
Tween-20, 0.5 µg poly [d(I-C)], 10 µM echinomycin and between 8 and 16 nM of biotinylated
target DNA. The reaction was incubated for 1 hour at room temperature, followed by 6
washes with PBS containing 1% Tween-20, 50 µM ZnCl2 and 3 washes with PBS
containing 0.05% Tween-20, 50 µM ZnCl2. 100 µl of PBS containing 1% Marvel, 0.05%
Tween-20, 50 µM ZnCl2 and a 1/5000 dilution of anti-M13 horseradish peroxidase antibody
conjugate (Amersham Pharmacia Biotech) was added to each well and incubated for 1 hour
at room temperature. The ELISA plate was washed 3 times with PBS containing 0.05%
Tween-20, 50 µM ZnCl2 followed by 3 washes with PBS containing 50
µM ZnCl2. The assay was developed with BCIP/NBT substrates and quantified using a
plate reader.

This method determined the binding site sequence of clone 0.4/4 to be
(T1)(G/A2)(C3)(G/A4)(T5) (see Figure 4).

c) Verification of the target DNA sequence

The optimal target DNA sequence, as determined by the binding site signature, was
synthesised together with two other related DNA sequences that were present in the
original random DNA library but differed in some of the optimal base positions of the
binding site.

These oligonucleotides had the sequence: 
tatagtTACGTGGCGatcacagtcagtccacacgtc 
tatagtTGTATGGCGatcacagtcagtccacacgtc 
tatagtCGTACGGCGatcacagtcagtccacacgtc 
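
The relationship between the degenerate library YRYRYGGCG, the binding site signature and the three test oligonucleotides can be checked with a short script. The following Python sketch is illustrative only: the function name `expand` and the lookup table `IUPAC` are assumptions introduced here and are not part of the described method.

```python
from itertools import product

# Degenerate codes used in the text: Y = C or T, R = G or A.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "GA"}

def expand(pattern):
    """Return every concrete sequence encoded by a degenerate pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in pattern))]

# The randomised library YRYRYGGCG contains 2**5 distinct target sites.
library = set(expand("YRYRYGGCG"))

# Signature reported for clone 0.4/4: T, G/A, C, G/A, T, then the fixed GGCG.
signature_sites = expand("TRCRTGGCG")

# Binding sites carried by the three oligonucleotides listed above.
test_sites = ["TACGTGGCG", "TGTATGGCG", "CGTACGGCG"]

print(len(library))                                  # size of the library
print(signature_sites)                               # sites allowed by the signature
print(all(site in library for site in test_sites))   # every test site is a library member
```

Only the first test site, TACGTGGCG, falls within the signature; the other two are library members that differ from it at some positions, as the text states.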

Binding of the phage clone was tested as a function of DNA concentration (from 5 nM to
0.312 nM) in the presence of 10 µM echinomycin. A phage ELISA was set up using 20 µl
phage supernatant, 0.5 µg poly [d(I-C)] and 10 µM echinomycin in PBS containing 1% Marvel,
1% Tween-20, 50 µM ZnCl2. The total volume of the assay was 50 µl. The assay was
washed and developed as described for the binding site signature assay.
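
The text gives only the end points of the DNA titration (5 nM and 0.312 nM). These values are consistent with a two-fold serial dilution over five points, which is an assumption made here rather than something the text states. A minimal Python sketch under that assumption, with the hypothetical helper `twofold_series`:

```python
def twofold_series(start_nm, n_points):
    """Concentrations (nM) of a two-fold serial dilution starting at start_nm."""
    return [start_nm / 2 ** i for i in range(n_points)]

# Assumed: five points, each half the concentration of the previous one.
series = twofold_series(5.0, 5)
print(series)
```

The last value of such a series is 0.3125 nM, which agrees with the quoted lower limit of 0.312 nM.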

This method showed that clone 0.4/4 bound preferentially to the sequence determined
from the binding site signature, i.e. TACGTGGCG, in the presence of the small molecule
(see Figure 5).

Example 1.4 - Selection of Zinc Finger Phage That Are Dissociated From Their DNA Targets
In The Presence of Distamycin A or Actinomycin D

This example describes phage that bound DNA targets with higher affinity in the absence
of ligand. These phage were isolated either (a) by the same method as in Example 1.1,
or (b) by selection in the absence of small molecule and phage elution from DNA using a
small molecule.

In this latter case (b) the method was as follows. 

Phage selection was carried out over 4 rounds. Binding reactions contained 10 pmol
biotinylated DNA site immobilised on 50 mg streptavidin coated beads (Dynal) and a 1 ml
solution of zinc finger phage library (as described in 1.1). Reactions were incubated for 1 h
on a rolling platform. After this time, beads were washed 20 times as described in 1.1 and
finally phage were eluted from the beads over 5 minutes using a solution containing ligand
(10 µM distamycin A, or 1 µM actinomycin D in PBS/Zn).

Some phage isolated by either of the above methods (a or b) bound DNA in the absence of
ligand but could be displaced by distamycin A at a concentration of 10 µM or by
actinomycin D at 1 µM. The distamycin sensitive clone was selected using the DNA target
AAAAAGCGGAAAAA and its helices were sequenced as:

QSRSLIQ QRDSLSR RSDERKR

The actinomycin D sensitive clone was selected with the DNA target AGCTTGGCG and
its helices were sequenced as:

RSDELTR RSDVLST TRSSRKK

Figure 6 demonstrates the sensitivity of each clone to the respective drug.

Example 2 - Modulation Of Binding Of Polypeptides To Target DNA By DNA
Binding Ligand

Individual phage clones are assayed for modulation of target DNA binding by ligand in a
phage ELISA binding assay.

Binding assay reactions are carried out in wells of a streptavidin-coated microtitre plate
(Boehringer Mannheim) as in Example 1, except that the distamycin concentration is
varied while the DNA concentration is kept constant at 2 nM.

Induction of higher affinity DNA binding is observed when distamycin is added to the
binding reaction at 10^-6 M to 10^-7 M.

Binding of the zinc finger phage to DNA in the absence of ligand, or at ligand
concentrations of 10^-9 M or lower, results in phage retention close to background level, i.e.
lower affinity binding than in the presence of ligand.

Background level affinity binding is defined as the phage retention in binding reactions that 
contain no DNA binding site. 

Example 3 - DNA-Ligand Modulatable Restriction Enzyme 

Phage-selected or rationally designed zinc finger domains which bind target DNA
sequences in a manner modulatable by a DNA binding ligand can be converted to
restriction enzymes which cleave DNA containing said target sequences in a manner
modulatable by DNA binding ligand. This is achieved by coupling an appropriate zinc
finger, as isolated in Example 1 above, to a cleavage domain of a restriction enzyme or
other nucleic acid cleaving moiety.

A method of converting zinc finger DNA binding domains to chimaeric restriction
endonucleases has been described in Kim, et al., (1996) Proc. Natl. Acad. Sci. USA
93:1156-1160. In order to demonstrate the applicability of DNA ligand-modulatable zinc
fingers to restriction enzymes, a fusion is made between the catalytic domain of Fok I as
described by Kim et al. and a zinc finger of Example 1. Fusion of the zinc finger nucleic
acid-binding domain to the catalytic domain of Fok I restriction enzyme results in a novel
endonuclease which cleaves DNA adjacent to the DNA recognition sequence of the zinc
finger (AAAAAAGGCG or AAAAAAGGCGAAAAAA).

The oligonucleotides AAAAAAGGCG and AAAAAAGGCGAAAAAA are synthesised
and ligated to arbitrary DNA sequences. After incubation with the zinc finger restriction
enzyme, the nucleic acids are analysed by gel electrophoresis. Bands indicating cleavage
of the nucleic acid at a position corresponding to the location of the oligonucleotide(s)
(AAAAAAGGCG / AAAAAAGGCGAAAAAA) are visible.

In a further experiment, the zinc finger is fused to an amino terminal copper/nickel binding
motif. Under the correct redox conditions (Nagaoka, M., et al., (1994) J. Am. Chem.
Soc. 116:4085-4086), sequence-specific DNA cleavage is observed only in the presence
of DNA incorporating oligonucleotide AAAAAAGGCG or AAAAAAGGCGAAAAAA.

Example 4 - Modulation Of Transcriptional Activity In Vivo 

A reporter system is constructed which produces a reporter signal conditionally depending on
the binding of the zinc finger DNA binding molecule to its target DNA sequence. This
binding, and hence transcription from the reporter system, is modulated by the DNA
binding ligand distamycin A.

A transient transfection system using zinc finger transcription factors is produced as
described in Choo, Y., et al., (1997) J. Mol. Biol. 273:525-532. This system comprises an
expression plasmid which produces a phage-selected zinc finger fused to the activation
domain of HSV VP16, and a reporter plasmid which contains the recognition sequence of
the zinc finger upstream of a CAT reporter gene.

Thus, a zinc finger which recognises the DNA sequence AAAAAAGGCG is selected by
phage display as described in Example 1. By the method of the preceding examples, said
zinc finger is used to construct transcription factors as described above.

A transient expression experiment is conducted, wherein the CAT reporter gene on the
reporter plasmid is placed downstream of the sequence AAAAAAGGCG. The reporter
plasmid is cotransfected with a plasmid vector expressing the zinc finger-HSV fusion
under the control of a constitutive promoter. No activation of CAT gene expression is
observed.

However, when the same experiment is conducted in the presence of distamycin A, CAT
expression is observed as a result of the binding of the zinc finger transcription factor to its
recognition sequence AAAAAAGGCG.

Example 5 - Isolation of cognate target nucleic acids 



Using a known DNA binding molecule, target DNA sequences to which it can bind are 
isolated. 

The 434 repressor is a gene regulatory protein of phage 434. It binds to a 14 bp operator
site (see Koudelka et al., 1987, Nature vol 326 pp 886-888). This operator site consists of
five conserved bp (1-5), then four variable bp (6-9), then five more conserved bp (10-14) as
shown below:

Site: 1               5   6   7   8   9   10              14
Base: A   C   A   A   G/T X   X   X   X   A/T T   T   G   T
wherein X is any base.

The conserved bases contact the 434 repressor protein. The four variable bases are thought
not to contact the 434 repressor protein. However, the four bases which do not contact the
434 repressor protein may affect the affinity of binding of the repressor to the operator site.

The 434 repressor protein (i.e. the DNA binding molecule) is contacted with a library of
different target DNA sequences in the presence and absence of ligand.

The target DNA sequences are synthesized using an Applied Biosystems 380A DNA
synthesizer and are purified by gel electrophoresis. The four variable bases ('X' as shown
above) are randomised, producing a library of 256 different target DNA molecules,
position 5 being T, and position 10 being A. At the 5' and 3' ends of this sequence are
placed PCR primer sequences for amplification and recovery of the central target
sequences.

Structure of target DNA sequence library:

5'                     1    6  9   14                     3'
GTCGGATCCTGTCTGAGGTGAG ACAATXXXXATTGT GTCTTCCGACGTCGAATTCGCG

wherein X is any base, and the partially randomised 434 operator is underlined.
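
The size and composition of this library follow directly from the four randomised positions. As a minimal sketch (the names `operator_library`, `LEFT_PRIMER_SITE` and `RIGHT_PRIMER_SITE` are assumptions introduced here for illustration), the library can be enumerated in Python as follows:

```python
from itertools import product

# Flanking sequences copied from the library structure shown above.
LEFT_PRIMER_SITE = "GTCGGATCCTGTCTGAGGTGAG"
RIGHT_PRIMER_SITE = "GTCTTCCGACGTCGAATTCGCG"

def operator_library():
    """Yield each 14 bp operator ACAAT-XXXX-ATTGT, with X = A, C, G or T."""
    for variable in product("ACGT", repeat=4):
        yield "ACAAT" + "".join(variable) + "ATTGT"

operators = list(operator_library())
full_length = [LEFT_PRIMER_SITE + op + RIGHT_PRIMER_SITE for op in operators]

print(len(operators))        # 4**4 library members
print(len(full_length[0]))   # length of each synthesised oligonucleotide
```

Each member has T at position 5 and A at position 10, and the operator ACAATAAATATTGT used in Example 6 is one of the enumerated sequences.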

The 434 repressor protein is added to the library of target DNA sequences, in the presence
and absence of 2 µM distamycin A (Sigma) ligand, in 200 µl binding buffer (9 mM
Tris-HCl pH 8.0, 90 mM KCl, 90 µM ZnSO4) and incubated for 30 min.

Nitrocellulose filters (BA 85, Schleicher and Schuell) are placed into a suction chamber (as
in Thiesen et al. (eds), Immunological Methods vol IV, Academic Press, Orlando) and
prewetted with 600 ml Tris-HCl binding buffer. The protein-oligonucleotide mix is applied to
the filter(s) with gentle suction, and the filters are washed with 4 ml Tris-HCl binding buffer.
Oligonucleotides are eluted in 200 µl binding buffer plus 1 mM 1,10-o-phenanthroline.

Oligonucleotides are then amplified by PCR, using the following primers:

Primer A 5'-GTCGGATCCTGTCTGAGGTGAG-3'
Primer B 5'-CGCGAATTCGACGTCGGAAGAC-3'

using an amplification kit (Perkin Elmer Cetus) with the following cycling regime:
93°C 30 sec; 45°C 120 sec; 45°C to 67°C ramp 60 sec; 67°C 180 sec for 25 cycles.
1 µl of eluted oligonucleotide material is used as template.
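
That the two primers match the flanking sequences of the library can be verified computationally. In the Python sketch below, `reverse_complement` is a hypothetical helper written for this illustration, the template uses one arbitrarily chosen library member (ACAATAAATATTGT), and the run time counts only the hold and ramp durations quoted in the cycling regime.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

PRIMER_A = "GTCGGATCCTGTCTGAGGTGAG"
PRIMER_B = "CGCGAATTCGACGTCGGAAGAC"

# One library member, chosen arbitrarily, between the two flanking sequences.
template = "GTCGGATCCTGTCTGAGGTGAG" + "ACAATAAATATTGT" + "GTCTTCCGACGTCGAATTCGCG"

# Primer A is identical to the 5' flank; primer B anneals to the 3' flank.
print(template.startswith(PRIMER_A))
print(template.endswith(reverse_complement(PRIMER_B)))

# Programmed time per cycle (seconds): 93°C, 45°C, ramp, 67°C; 25 cycles in total.
STEPS_SEC = [30, 120, 60, 180]
total_minutes = sum(STEPS_SEC) * 25 / 60
print(total_minutes)
```

The actual run will take somewhat longer than the programmed total, because heating and cooling between the other steps are not included.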

Optionally, the PCR amplified DNA product is then used in further rounds of incubation
with the 434 repressor protein, nitrocellulose filter binding, oligonucleotide elution and
PCR amplification.

PCR amplified DNA products are then sequenced using standard techniques.

Target DNA sequences are selected which bind the 434 repressor with higher affinity in the
presence of ligand than in the absence of ligand. Furthermore, DNA sequences are selected
which bind the 434 repressor in the absence of ligand with a higher affinity than in the
presence of ligand.

Example 6 - Isolation of ligands which affect the binding of a DNA binding molecule
to its cognate DNA target

The 434 repressor protein of Example 5 is used in conjunction with a target operator DNA
sequence to which it binds.

The operator sequence used is

5'-ACAATAAATATTGT-3'

A library of DNA binding ligands is used in place of the 2 µM distamycin A (Sigma) DNA
binding ligand of Example 5.

Ligands are isolated which are capable of increasing the affinity of the 434 repressor for its
cognate DNA target sequence. Ligands are also isolated which are capable of decreasing
the affinity of the 434 repressor for its cognate DNA target sequence.

Example 7 - Generation of Transgenic Plants Expressing a Zinc Finger Protein Fused
to a Transactivation Domain

To investigate the utility of heterologous zinc finger proteins for the regulation of plant
genes, a synthetic zinc finger protein was designed and introduced into transgenic
Arabidopsis thaliana under the control of a promoter capable of expression in a plant as
described below. A second construct comprising the zinc finger protein binding sequence
fused upstream of the Green Fluorescent Protein (GFP) reporter gene was also introduced
into transgenic Arabidopsis thaliana as described in Example 8. Crossing the two
transgenic lines produced progeny plants carrying both constructs in which the GFP
reporter gene was expressed, demonstrating transactivation of the gene by the zinc finger
protein.

Using conventional cloning techniques, the following constructs were made as XbaI-BamHI
fragments in the cloning vector pcDNA3.1 (Invitrogen).

pTFIIIAZifVP16 

pTFIIIAZifVP16 comprises a fusion of four finger domains of the zinc finger protein
TFIIIA fused to the three fingers of the zinc finger protein Zif268. The TFIIIA-derived
sequence is fused in frame to the translational initiation sequence ATG. The 7 amino acid
Nuclear Localization Sequence (NLS) of the wild-type Simian Virus 40 Large T-Antigen is
fused to the 3' end of the Zif268 sequence, and the VP16 transactivation sequence is fused
downstream of the NLS. In addition, a 30 bp sequence from the c-myc gene is introduced
downstream of the VP16 domain as a "tag" to facilitate cellular localization studies of the
transgene. While this is experimentally useful, the presence of this tag is not required for
the activation (or repression) of gene expression via zinc finger proteins.

The sequence of pTFIIIAZifVP16 is shown in SEQ ID No. 1 as an XbaI-BamHI fragment.
The translational initiating ATG is located at position 15 and is double underlined. Fingers
1 to 4 of TFIIIA extend from position 18 to position 416. Finger 4 (positions 308-416)
does not bind DNA within the target sequence, but instead serves to separate the first three
fingers of TFIIIA from Zif268, which is located at positions 417-689. The NLS is located
at positions 701-722, the VP16 transactivation domain at positions 723-956, and the
c-myc tag at positions 957-986. This is followed by the translational terminator TAA.

pTFIIIAZifVP64

pTFIIIAZifVP64 is similar to pTFIIIAZifVP16 except that the VP64 transactivation
sequence replaces the VP16 sequence of pTFIIIAZifVP16.

The sequence of pTFIIIAZifVP64 is shown in SEQ ID No. 2 as an XbaI-BamHI fragment.
Locations within this sequence are as for pTFIIIAZifVP16 except that the VP64 domain is
located at positions 723-908 and the c-myc tag at positions 909-938.

Using conventional cloning techniques, the sequence 5'-AAGGAGATATAACA-3' is
introduced upstream of the translational initiating ATG of both pTFIIIAZifVP16 and
pTFIIIAZifVP64. This sequence incorporates a plant translational initiation context
sequence to facilitate translation in plant cells (Prasher et al., Gene 111: 229-233 (1992);
Chalfie et al., Science 263: 802-805 (1992)).



The final constructs are transferred to the plant binary vector pBIN121 between the
Cauliflower Mosaic Virus 35S promoter and the nopaline synthase terminator sequence.
This transfer is effected using the XbaI site of pBIN121. The binary constructs thus derived
are then introduced into Agrobacterium tumefaciens (strain LBA 4044 or GV 3101) either
by triparental mating or direct transformation.

Next, Arabidopsis thaliana are transformed with Agrobacterium containing the binary
vector construct using conventional transformation techniques. For example, using
vacuum infiltration (e.g. Bechtold et al., CR Acad Sci Paris 316: 1194-1199; Bent et al.,
Science 265: 1856-1860 (1994)), transformation can be undertaken essentially as follows.
Seeds of Arabidopsis are planted on top of cheesecloth covered soil and allowed to grow at
a final density of 1 per square inch under conditions of 16 hours light/8 hours dark. After
4-6 weeks, plants are ready to infiltrate. An overnight liquid culture of Agrobacterium
carrying the appropriate construct is grown up at 28°C and used to inoculate a fresh 500 ml
culture. This culture is grown to an OD600 of at least 2.0, after which the cells are
harvested by centrifugation and resuspended in 1 litre of infiltration medium (1 litre
prepared to contain: 2.2 g MS Salts, 1 X B5 vitamins, 50 g sucrose, 0.5 g MES pH 5.7,
0.044 µM benzylaminopurine, 200 µl Silwet L-77 (OSI Specialty)). To vacuum infiltrate,
pots are inverted into the infiltration medium and placed into a vacuum oven at room
temperature. Infiltration is allowed to proceed for 5 mins at 400 mm Hg. After releasing
the vacuum, the pot is removed, laid on its side and covered with Saran wrap. The
cover is removed the next day and the plant stood upright. Seeds harvested from infiltrated
plants are surface sterilized and selected on appropriate medium. Vernalization is
undertaken for two nights at around 4°C. Plates are then transferred to a plant growth
chamber. After about 7 days, transformants are visible and are transferred to soil and
grown to maturity.
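
The infiltration medium is specified per litre, so the weighed and pipetted components scale linearly with the volume prepared. The following Python sketch is a hypothetical convenience helper, not part of the described protocol. The name `scale_recipe` is an assumption, as is reading the Silwet L-77 quantity as 200 µl per litre. B5 vitamins (1 X) and benzylaminopurine (0.044 µM) are omitted because they are given as final concentrations, which do not change with volume.

```python
# Per-litre amounts of the components that are weighed or pipetted directly.
PER_LITRE = {
    "MS salts (g)": 2.2,
    "sucrose (g)": 50.0,
    "MES (g)": 0.5,
    "Silwet L-77 (ul)": 200.0,  # assumed reading of the quantity in the text
}

def scale_recipe(volume_litres):
    """Return component amounts for the requested volume of infiltration medium."""
    return {name: round(amount * volume_litres, 3)
            for name, amount in PER_LITRE.items()}

# Example: amounts needed for 500 ml of medium.
half = scale_recipe(0.5)
for name, amount in half.items():
    print(name, amount)
```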

Many transgenic plants are grown to maturity. They appear phenotypically normal and are
selfed to homozygosity using standard techniques involving crossing and germination of
progeny on an appropriate concentration of antibiotic.

Transgenic plant lines carrying the TFIIIAZifVP16 construct are designated
At-TFIIIAZifVP16 and transgenic plant lines carrying the TFIIIAZifVP64 construct are
designated At-TFIIIAZifVP64.

Example 8 - Generation of Transgenic Plants Carrying a Green Fluorescent Protein
Reporter Gene

A reporter plasmid is constructed which incorporates the target DNA sequence of the
TFIIIAZifVP16 and TFIIIAZifVP64 zinc finger proteins described above upstream of the
Green Fluorescent Protein (GFP) reporter gene. The target DNA sequence of
TFIIIAZifVP16 and TFIIIAZifVP64 is shown in SEQ ID No. 3. This sequence is
incorporated in single copy immediately upstream of the CaMV 35S -90 minimal promoter
to which the GFP gene is fused.

The resultant plasmid, designated pTFIIIAZif-UAS/GFP, is transferred to the plant binary
vector pBIN121, replacing the Cauliflower Mosaic Virus 35S promoter. This construct is
then transferred to Agrobacterium tumefaciens and subsequently transferred to Arabidopsis
thaliana as described above. Transgenic plants carrying the construct are designated
At-TFIIIAZif-UAS/GFP.

Example 9 - Use of Zinc Finger Proteins to Up-Regulate a Transgene in a Plant

To assess whether the zinc finger constructs TFIIIAZifVP16 and TFIIIAZifVP64 are able
to transactivate gene expression in planta, Arabidopsis lines At-TFIIIAZifVP16 and
At-TFIIIAZifVP64 are crossed to At-TFIIIAZif-UAS/GFP. The progeny of such crosses
yield plants that carry the reporter construct TFIIIAZif-UAS/GFP together with either the
zinc finger protein construct TFIIIAZifVP16 or the zinc finger construct TFIIIAZifVP64.

Plants are screened for GFP expression using an inverted fluorescence microscope (Leitz
DM-IL) fitted with a filter set (Leitz-D excitation BP 355-425, dichroic 455, emission LP
460) suitable for the main 395 nm excitation and 509 nm emission peaks of GFP.

In each case, the zinc finger construct is able to transactivate gene expression,
demonstrating the utility of heterologous zinc finger proteins for the regulation of plant
genes.

Example 10 - Generation of Transgenic Plants Expressing a Zinc Finger Fused to a
Plant Transactivation Domain

The constructs pTFIIIAZifVP16 and pTFIIIAZifVP64 utilize the VP16 and VP64
transactivation domains of Herpes Simplex Virus to activate gene expression. Alternative
transactivation domains are various and include the C1 transactivation domain sequence
(from maize; see Goff et al., Genes Dev. 5: 298-309 (1991); Goff et al., Genes Dev. 6:
864-875 (1992)), and a number of other domains that have been reported from plants (see
Estruch et al., Nucl. Acids Res. 22: 3983-3989 (1994)).

Construct pTFIIIAZifC1 is made as described above for pTFIIIAZifVP16 and
pTFIIIAZifVP64 except that the VP16/VP64 activation domains are replaced with the C1
transactivation domain sequence.

A transgenic Arabidopsis line, designated At-TFIIIAZifC1, is produced as described above
in Example 8 and crossed with At-TFIIIAZif-UAS/GFP. The progeny of such crosses yield
plants that carry the reporter construct TFIIIAZif-UAS/GFP together with the zinc
finger protein construct TFIIIAZifC1.

Plants are screened for GFP expression using an inverted fluorescence microscope (Leitz
DM-IL) fitted with a filter set (Leitz-D excitation BP 355-425, dichroic 455, emission LP
460) suitable for the main 395 nm excitation and 509 nm emission peaks of GFP.

Example 11 - Regulation of an endogenous plant gene - UDP glucose flavonoid
glucosyl-transferase (UFGT)

To determine whether a suitably configured zinc finger could be used to regulate gene
transcription from an endogenous gene in a plant, the maize UDP glucose flavonoid
glucosyl-transferase (UFGT) gene (the Bronze1 gene) was selected as the target gene.
UFGT is involved in anthocyanin biosynthesis. A number of wild type alleles have been
identified, including Bz-W22, which conditions a purple phenotype in the maize seed and
plant. The Bronze locus has been the subject of extensive genetic research because its
phenotype is easy to score and its expression is tissue specific and varied (for example
aleurone, anthers, husks, cob and roots). The complete sequence of Bz-W22 including
upstream regulatory sequences has been determined (Ralston et al., Genetics 119: 185-197).
A number of sequence motifs that bind transcriptional regulatory proteins have been
identified within the Bronze promoter, including sequences homologous to consensus
binding sites for the myb- and myc-like proteins (Roth et al., Plant Cell 3: 317-325).

Identification of a zinc finger that binds to the Bronze promoter

The first step is to carry out a screen for zinc finger proteins that bind to a selected region
of the Bronze promoter. A region is chosen just upstream of the AT rich block located
between -88 and -80, which has been shown to be critical for Bz1 expression (Roth et al.,
supra).

1. Bacterial colonies containing phage libraries that express a library of zinc fingers
randomised at one or more DNA binding residues (see Example 1) are transferred from
plates to culture medium. Bacterial cultures are grown overnight at 30°C. Culture
supernatant containing phages is obtained by centrifugation.

2. 10 pmol of biotinylated target DNA, derived from the Bronze promoter,
immobilised on 50 mg streptavidin beads (Dynal) is incubated with 1 ml of the bacterial
culture supernatant diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween
in a streptavidin coated tube for 1 hour at 20°C on a rolling platform in the presence of
4 µg poly [d(I-C)] as competitor.

3. The tubes are washed 20 times with PBS containing 50 µM ZnCl2 and 1% Tween,
and 3 times with PBS containing 50 µM ZnCl2 to remove non-binding phage.

4. The remaining phage are eluted using 0.1 ml 0.1 M triethylamine and the solution is
neutralised with an equal volume of 1 M Tris-Cl (pH 7.4).

5. Logarithmic-phase E. coli TG1 cells are infected with eluted phage, and grown
overnight, as described above, to prepare phage supernatants for subsequent rounds of
selection.

6. Single colonies of transformants obtained after four rounds of selection (steps 1
to 5) as described, are grown overnight in culture. Single-stranded DNA is prepared from
phage in the culture supernatant and sequenced using the Sequenase™ 2.0 kit (U.S.
Biochemical Corp.). The amino acid sequences of the zinc finger clones are deduced.



Construction of a vector for expression of the zinc finger clone fused to a C1 activation
domain in maize protoplasts

Using conventional cloning techniques and in a similar manner to Example 7, the construct
pZifBz23C1 is made in cloning vector pcDNA3.1 (Invitrogen).

pZifBz23C1 comprises the three fingers of the zinc finger protein clone ZifBz23 fused in
frame to the translational initiation sequence ATG. The 7 amino acid Nuclear Localization
Sequence (NLS) of the wild-type Simian Virus 40 Large T-Antigen is fused to the 3' end of
the ZifBz23 sequence, and the C1 transactivation sequence is fused downstream of the
NLS. In addition, a 30 bp sequence from the c-myc gene is introduced downstream of the
VP16 domain as a "tag" to facilitate cellular localization studies of the transgene.

The coding sequences of pZifBz23C1 are transferred to a plant expression vector suitable
for use in maize protoplasts, the coding sequence being under the control of a constitutive
CaMV 35S promoter. The resulting plasmid is termed pTMBz23. The vector also
contains a hygromycin resistance gene for selection purposes.

A suspension culture of maize cells is prepared from calli derived from embryos obtained
from inbred W22 maize stocks grown to flowering in a greenhouse and self pollinated,
using essentially the protocol described in EP-A-332104 (Examples 40 and 41). The
suspension culture is then used to prepare protoplasts using essentially the protocol
described in EP-A-332104 (Example 42).

Protoplasts are resuspended in 0.2 M mannitol, 0.1% w/v MES, 72 mM NaCl, 70 mM
CaCl2, 2.5 mM KCl, 2.5 mM glucose, pH adjusted to 5.8 with KOH, at a density of about
2 x 10^6 per ml. 1 ml of the protoplast suspension is then aliquotted into plastic
electroporation cuvettes and 10 µg of linearized pTMBz23 added. Electroporation is carried
out as described in EP-A-332104 (Example 57). Protoplasts are cultured following
transformation at a density of 2 x 10^6 per ml in KM-8p medium with no solidifying agent
added.

Measurements of the levels of UFGT expression are made using colorimetry and/or 
biochemical detection methods such as Northern blots or the enzyme activity assays 
described by Dooner and Nelson, Proc. Natl. Acad. Sci. 74: 5623-5627 (1977). 
Comparison is made with mock-treated protoplasts transformed with a vector-only control. 

Alternatively, or in addition to analysing expression of UFGT in transformed protoplasts, 
intact maize plants may be recovered from transformed protoplasts and the extent of UFGT 
expression determined. Suitable protocols for growing up maize plants from transformed 
protoplasts are known in the art. Electroporated protoplasts are resuspended in KM-8p 
medium containing 1.2% w/v Seaplaque agarose and 1 mg/l 2,4-D. Once the gel has set, 
protoplasts in agarose are placed in the dark at 26°C. After 14 days, colonies arise from the 
protoplasts. The agarose containing the colonies is transferred to the surface of a 9 cm 
diameter petri dish containing 30 ml of N6 medium (EP-A-332,104) containing 2,4-D 
solidified with 0.24% Gelrite®. 100 mg/l hygromycin B is also added to select for 
transformed cells. The callus is cultured further in the dark at 26°C and callus pieces 
subcultured every two weeks onto fresh solid medium. Pieces of callus may be analysed 
for the presence of the pTMBz23 construct and/or UFGT expression determined. 

Corn plants are regenerated as described in Example 47 of EP-A-332,104. Plantlets appear 
in 4 to 8 weeks. When 2 cm tall, plantlets are transferred to ON6 medium (EP-A-332,104) 
in GA7 containers and roots form in 2 to 4 weeks. After transfer to peat pots plants soon 
become established and can then be treated as normal corn plants. 

Plantlets and plants can be assayed for UFGT expression as described above. 

Example 12 - Regulation of gene expression using a chemically inducible small 
molecule 

The Zif268 zinc finger phage display library described in Example 1 is screened using the 
bronze promoter sequence described in Example 11 and a library of small molecule 
candidate DNA binding ligands, prescreened to remove non-DNA binding molecules. The 
protocol used is essentially a modification of Example 1 but using multiple ligands. To 
increase the number of ligands in the screen, ligands are screened in groups of twenty. 
Once zinc finger clones are identified that have ligand-dependent DNA binding, a single 
zinc finger clone is tested for ligand-dependent binding against each individual ligand in 
the mixture originally selected. In this way, a gene switch comprising a zinc finger clone 
that binds to a region of the bronze promoter in a manner modulatable by a chemical 
ligand, the region of the bronze promoter and the chemical ligand itself is identified. 
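
The pooling and retesting logic described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the laboratory protocol: the function names, the ligand labels and the two callables standing in for the binding assays are hypothetical, and the only figure taken from the text is the group size of twenty.

```python
from typing import Callable, Dict, List, Sequence

POOL_SIZE = 20  # ligands are screened in groups of twenty (from the text)


def make_pools(ligands: Sequence[str], pool_size: int = POOL_SIZE) -> List[List[str]]:
    """Split the ligand library into consecutive pools of at most pool_size."""
    return [list(ligands[i:i + pool_size]) for i in range(0, len(ligands), pool_size)]


def deconvolute(
    pools: List[List[str]],
    pool_is_hit: Callable[[List[str]], bool],
    ligand_is_hit: Callable[[str], bool],
) -> Dict[int, List[str]]:
    """Retest each member of every positive pool individually.

    pool_is_hit and ligand_is_hit are hypothetical stand-ins for the
    binding assays; they are not part of the described method.
    Returns a mapping from pool index to the individual ligands that
    reproduce ligand-dependent binding for the selected clone.
    """
    hits: Dict[int, List[str]] = {}
    for index, pool in enumerate(pools):
        if not pool_is_hit(pool):
            continue  # no ligand-dependent clone recovered with this pool
        confirmed = [ligand for ligand in pool if ligand_is_hit(ligand)]
        if confirmed:
            hits[index] = confirmed
    return hits


if __name__ == "__main__":
    # 45 placeholder ligands and one invented "active" ligand, for illustration only.
    library = [f"ligand_{n:03d}" for n in range(1, 46)]
    active = {"ligand_027"}
    pools = make_pools(library)
    result = deconvolute(
        pools,
        pool_is_hit=lambda pool: any(lig in active for lig in pool),
        ligand_is_hit=lambda lig: lig in active,
    )
    print(len(pools), result)
```

With pools of twenty, a library of N ligands needs roughly N/20 first-round screens plus twenty individual retests for each positive pool, which is the saving the grouped screen is intended to provide.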

The zinc finger clone is fused to a VP16 transactivation domain and other relevant 
sequences as described in Example 7. The resulting construct, pZFSelectCl, is transferred 
to the plant binary vector pBIN121 between the Cauliflower Mosaic Virus 35S promoter 
and the nopaline synthase terminator sequence. The binary construct thus derived is then 
introduced into Agrobacterium tumefaciens (strain LBA4404 or GV3101) either by 
triparental mating or direct transformation. 

A transgenic Arabidopsis line, designated At-ZFSelectCl, is produced as described above 
in Example 8. 

A further transgenic Arabidopsis line, designated At-BzGUS, is produced which comprises 
a reporter construct containing the E. coli beta-glucuronidase gene (GUS) fused to a -90 
minimal 35S promoter to which is operably linked the bronze promoter sequence used in 
the tripartite screen. Arabidopsis lacks endogenous GUS activity. Further, GUS activity is 
very stable and expression can be measured accurately using fluorometric assays of very 
small amounts of transformed plant tissue (see Jefferson et al., EMBO J. 6: 3901-3907 
(1987)). 

At-ZFSelectCl lines are crossed with At-BzGUS lines. The progeny of such crosses yield 
plants that carry the reporter construct BzGUS together with the zinc finger protein 
construct ZFSelectCl. 

Plants are grown in a range of concentrations of the chemical ligand and GUS activity in 
leaf tissue is measured as described in Jefferson et al., EMBO J. 6: 3901-3907 (1987). GUS 
activity in non-transgenic plants, At-ZFSelectCl lines and At-BzGUS lines in the presence 
of the chemical ligand is also measured. 
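
One way to summarise such measurements is to express mean GUS activity at each ligand concentration relative to the zero-ligand control. The following Python sketch assumes that replicate readings are available as plain numbers in arbitrary fluorescence units; the function name, the data layout and the requirement for a 0.0 concentration entry are illustrative assumptions rather than part of the described assay.

```python
from statistics import mean
from typing import Dict, List


def dose_response(readings: Dict[float, List[float]]) -> Dict[float, float]:
    """Mean GUS activity at each ligand concentration, relative to the zero dose.

    readings maps ligand concentration to replicate activity values
    (hypothetical layout; units are arbitrary but must be consistent).
    """
    if 0.0 not in readings:
        raise ValueError("a zero-ligand control is required")
    baseline = mean(readings[0.0])
    if baseline <= 0:
        raise ValueError("zero-ligand activity must be positive")
    # Sort by concentration so the result reads as a dose series.
    return {conc: mean(values) / baseline for conc, values in sorted(readings.items())}
```

Applying the same function to readings from the non-transgenic and single-transgene control lines gives a direct check that any change in activity depends on both the reporter and the zinc finger construct being present.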

Example 13 - Tripartite Screen for a zinc finger/target DNA and small molecule 
ligand and the use of the identified components in regulating gene expression 

A screen is performed as described in Example 12 except that the target DNA is a 
randomised library based on the Bronze promoter sequence and the procedure described in 
Example 1.3 is used to determine the binding site signature of identified clones once a 
ligand has been selected. Verification of the target DNA sequence is also performed as 
described in Example 1.3. 

A target DNA identified in the screen is introduced into a -90 minimal CaMV 35S-GUS 
reporter construct as described in Example 12 and used to produce a transgenic 
Arabidopsis line. A corresponding zinc finger clone is introduced into an expression 
construct as described in Example 12 and used to produce a transgenic Arabidopsis line. 
The two lines are crossed and progeny tested for induction of GUS activity in the presence 
or absence of the ligand identified in the screen. 

All publications mentioned in the above specification are herein incorporated by reference. 
Various modifications and variations of the described methods and system of the invention 
will be apparent to those skilled in the art without departing from the scope and spirit of the 
invention. Although the invention has been described in connection with specific preferred 
embodiments, it should be understood that the invention as claimed should not be unduly 
limited to such specific embodiments. Indeed, various modifications of the described 
modes for carrying out the invention which are obvious to those skilled in molecular 
biology or related fields are intended to be within the scope of the following claims. 



Sequence ID 1: TFIIIA/Zif-VP16 



TCTAGAGCGCCGCCATGGGAGAGAAGGCGCTGCCGGTGGTGTATAAGCGGTACATCTGCTC 
TTTCGCCGACTGCGGCGCTGCTTATAACAAGAACTGGAAACTGCAGGCGCATCTGTGCAAA 
CACACAGGAGAGAAACCATTTCCATGTAAGGAAGAAGGATGTGAGAAAGGCTTTACCTCGC 
TTCATCACTTAACCCGCCACTCACTCACTCATACTGGCGAGAAAAACTTCACATGTGACTC 
GGATGGATGTGACTTGAGATTTACTACAAAGGCAAACATGAAGAAGCACTTTAACAGATTC 
CATAACATCAAGATCTGCGTCTATGTGTGCCATTTTGAGAACTGTGGCAAAGCATTCAAGA 
AACACAATCAATTAAAGGTTCATCAGTTCAGTCACACACAGCAGCTGCCGTATGCTTGCCC 
TGTCGAGTCCTGCGATCGCCGCTTTTCTCGCTCGGATGAGCTTACCCGCCATATCCGCATC 
CACACAGGCCAGAAGCCCTTCCAGTGTCGAATCTGCATGCGTAACTTCAGTCGTAGTGACC 
ACCTTACCACCCACATCCGCACCCACACAGGCGAGAAGCCTTTTGCCTGTGACATTTGTGG 
GAGGAAGTTTGCCAGGAGTGATGAACGCAAGAGGCATACCAAAATCCATTTAAGACAGAAG 
GACGCGGCCGCACTCGAGCGGAATTCCGGCCCAAAAAAGAAGAGAAAGGTCGCCCCCCCGA 
CCGATGTCAGCCTGGGGGACGAGCTCCACTTAGACGGCGAGGACGTGGCGATGGCGCATGC 
CGACGCGCTAGACGATTTCGATCTGGACATGTTGGGGGACGGGGATTCCCCGGGGCCGGGA 
TTTACCCCCCACGACTCCGCCCCCTACGGCGCTCTGGATACGGCCGACTTCGAGTTTGAGC 
AGATGTTTACCGATGCCCTTGGAATTGACGAGTACGGTGGGGAACAAAAACTTATTTCTGA 
AGAAGATCTGTAAGGATCC 

Sequence ID 2: TFIIIA/Zif-VP64 

TCTAGAGCGCCGCCATGGGAGAGAAGGCGCTGCCGGTGGTGTATAAGCGGTACATCTGCTC 
TTTCGCCGACTGCGGCGCTGCTTATAACAAGAACTGGAAACTGCAGGCGCATCTGTGCAAA 
CACACAGGAGAGAAACCATTTCCATGTAAGGAAGAAGGATGTGAGAAAGGCTTTACCTCGC 
TTCATCACTTAACCCGCCACTCACTCACTCATACTGGCGAGAAAAACTTCACATGTGACTC 
GGATGGATGTGACTTGAGATTTACTACAAAGGCAAACATGAAGAAGCACTTTAACAGATTC 
CATAACATCAAGATCTGCGTCTATGTGTGCCATTTTGAGAACTGTGGCAAAGCATTCAAGA 
AACACAATCAATTAAAGGTTCATCAGTTCAGTCACACACAGCAGCTGCCGTATGCTTGCCC 
TGTCGAGTCCTGCGATCGCCGCTTTTCTCGCTCGGATGAGCTTACCCGCCATATCCGCATC 
CACACAGGCCAGAAGCCCTTCCAGTGTCGAATCTGCATGCGTAACTTCAGTCGTAGTGACC 
ACCTTACCACCCACATCCGCACCCACACAGGCGAGAAGCCTTTTGCCTGTGACATTTGTGG 
GAGGAAGTTTGCCAGGAGTGATGAACGCAAGAGGCATACCAAAATCCATTTAAGACAGAAG 
GACGCGGCCGCACTCGAGCGGAATTCCGGCCCAAAAAAGAAGAGAAAGGTCGAACTTCAGC 
TGACTTCGGATGCATTAGATGACTTTGACTTAGATATGCTAGGATCTGACGCGCTAGACGA 
TTTCGATCTGGACATGTTGGGCAGCGATGCTCTAGACGATTTCGATTTAGATATGCTTGGC 
TCGGATGCCCTGGATGACTTCGACCTCGACATGCTGTCAAGTCAGCTGAGCCAGGAACAAA 
AACTTATTTCTGAAGAAGATCTGTAAGGATCC 



Sequence ID 3: TFIIIA/Zif binding site 

TgcgtgggcgTGTACCTggatgggagacC 
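
The mixed case in Sequence ID 3 appears to set off two sub-sites within the composite binding site. The short Python sketch below extracts the lowercase runs and their positions. Reading the first run as the Zif268 portion (it matches the widely reported Zif268 consensus GCGTGGGCG) and the second as the TFIIIA-derived portion is an assumption made here for illustration; the listing itself does not annotate them. The function names are hypothetical.

```python
import re
from typing import List, Tuple

SEQ_ID_3 = "TgcgtgggcgTGTACCTggatgggagacC"  # Sequence ID 3, as printed


def lowercase_runs(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based start, run) for each stretch of lowercase bases."""
    return [(m.start(), m.group()) for m in re.finditer(r"[acgt]+", seq)]


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string, returned in upper case."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for start, run in lowercase_runs(SEQ_ID_3):
        print(start, run, reverse_complement(run))
```

Reporting the reverse complement alongside each run makes it easier to compare the sub-sites with binding sites quoted on the opposite strand elsewhere.
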

1. A method of selecting a gene switch, which gene switch comprises (i) a target DNA 
molecule; (ii) a DNA binding molecule which binds to the target DNA molecule in a 
manner modulatable by a DNA binding ligand; and (iii) the DNA binding ligand, which 
method comprises: 

(a) contacting one or more candidate target DNA molecule(s) with one or more 
candidate DNA binding molecules, in the presence of one or more DNA binding ligands, 
wherein at least one of the candidate DNA binding molecules comprises a non-naturally 
occurring DNA binding domain; 

(b) selecting a complex comprising a candidate target DNA, a DNA binding molecule 
and a DNA binding ligand; 

(c) isolating and/or identifying the unknown components of the complex; 

(d) comparing the binding of the DNA binding molecule component of the complex to 
the target DNA component of the complex in the presence and absence of the DNA 
binding ligand component of the complex; and 

(e) selecting complexes where said binding differs in the presence and absence of the 
DNA binding ligand component. 

2. A method according to claim 1 wherein the DNA binding molecules are provided 
as a plurality of DNA binding molecules. 

3. A method according to claim 2 wherein the DNA binding molecules are provided 
as a library of DNA binding molecules. 

4. A method according to any one of claims 1 to 3 wherein the target DNA is provided 
as a plurality of DNA sequences. 

5. A method according to any one of claims 1 to 4 wherein the target DNA is provided 
as a library of DNA sequences, said sequences being related to one another by sequence 
homology. 

6. A method according to any one of the preceding claims wherein a plurality of 
candidate DNA binding ligands are used. 

7. A method according to claim 6 wherein one target DNA sequence is used. 

8. A method according to claim 6 or claim 7 wherein one of the components isolated 
and/or identified in step (c) is a DNA binding ligand component. 

9. A method according to any one of the preceding claims wherein one of the 
components isolated in step (c) is a DNA binding molecule component. 

10. A method according to any one of the preceding claims wherein the DNA binding 
molecule component has a higher affinity for the target DNA in the presence of the DNA 
binding ligand component than in the absence of the DNA binding ligand component. 

11. A method according to any one of claims 1 to 9 wherein the DNA binding molecule 
component has a higher affinity for the target DNA in the absence of the DNA binding 
ligand component than in the presence of the DNA binding ligand component. 

12. The method according to any one of the preceding claims, wherein said candidate 
DNA binding molecules are polypeptides. 

13. The method according to claim 12, wherein said candidate DNA binding molecules 
are polypeptides at least partly derived from transcription factors. 

14. The method according to claim 13, wherein said candidate DNA binding molecules 
are derived from zinc finger transcription factors. 

15. A method according to any one of the preceding claims, wherein the candidate 
DNA binding molecules are provided as a phage display library. 

16. A method according to any one of the preceding claims, wherein the DNA binding 
ligand is selected from Distamycin A, Actinomycin D and echinomycin. 

17. A gene switch comprising (i) a target DNA molecule; (ii) a DNA binding molecule 
which binds to the target DNA molecule in a manner modulatable by a DNA binding 
ligand; and (iii) the DNA binding ligand. 



18. Use of a DNA binding molecule selected by the method of any one of claims 1 to 
16 in a method of regulating transcription from a DNA sequence comprising a target DNA 
to which the DNA binding molecule binds in a manner modulatable by a DNA binding 
ligand. 

19. Use of a DNA binding ligand selected by the method of any one of claims 1 to 16 in 
a method of regulating transcription from a DNA sequence comprising a target DNA to 
which a DNA binding molecule binds in a manner modulatable by the DNA binding 
ligand. 

20. Use of a target DNA selected by the method of any one of claims 1 to 16 in a 
method of regulating transcription from a DNA sequence comprising the target DNA to 
which a DNA binding molecule binds in a manner modulatable by a DNA binding ligand. 

21. A method of modulating the expression of one or more genes, said method 
comprising administering a DNA binding molecule and DNA binding ligand selected 
according to the method of any one of claims 1 to 16 to a cell wherein the regulatory 
sequences of said genes comprise a target DNA selected according to the method of any 
one of claims 1 to 16. 

22. A method of modulating the expression of one or more nucleotide sequences of 
interest in a host cell which host cell comprises a nucleic acid sequence capable of 
directing the expression of a DNA binding molecule and a target DNA sequence to which 
the DNA binding molecule binds in a manner modulatable by a DNA binding ligand which 
method comprises administering said DNA binding ligand to the cell and wherein the DNA 
binding molecule is heterologous to the host cell. 

23. A method according to claim 21 or claim 22 wherein the host cell is a plant cell. 

24. A method according to claim 23 wherein the plant cell is part of a plant and the 
target sequence is part of a regulatory sequence to which the nucleotide sequence of interest 
is operably linked, said regulatory sequence being preferentially active in the male or 
female organs of the plant. 

25. A non-human transgenic organism comprising a target DNA sequence and a nucleic 
acid sequence capable of directing the expression of a DNA binding molecule which binds 
to the target DNA in a manner modulatable by a DNA binding ligand wherein the target 
DNA sequence and/or nucleic acid sequence are heterologous to the organism. 

26. A transgenic non-human organism according to claim 25 which is a plant. 

ABSTRACT 

GENE SWITCHES 

A method is provided of selecting a gene switch, which gene switch comprises (i) a target 
DNA molecule; (ii) a DNA binding molecule which binds to the target DNA molecule in a 
manner modulatable by a DNA binding ligand; and (iii) the DNA binding ligand, which 
method comprises: 

(a) contacting one or more candidate target DNA molecule(s) with one or more 
candidate DNA binding molecules, in the presence of one or more DNA binding ligands, 
wherein at least one of the candidate DNA binding molecules comprises a non-naturally 
occurring DNA binding domain; 

(b) selecting a complex comprising a candidate target DNA, a DNA binding molecule 
and a DNA binding ligand; 

(c) isolating and/or identifying the unknown components of the complex; 

(d) comparing the binding of the DNA binding molecule component of the complex to 
the target DNA component of the complex in the presence and absence of the DNA 
binding ligand component of the complex; and 

(e) selecting complexes where said binding differs in the presence and absence of the 
DNA binding ligand component. 